
The fifth lane of data science: storytelling

2/4/2019

 
If you are using data in any capacity, I recommend listening to DataFramed. I always manage to learn a little something and also get to be sort of a data voyeur. The podcast reliably features working data scientists willing to share their process and data culture.
I think it's imperative that as many people as possible know how to handle their own personal data, as well as how to handle data within their own domain. I think every domain uses data in some aspect. Either you're collecting it, manipulating it, or using it. It doesn't matter what discipline you're in. You need to know how to manage it.

You need to know when you have enough data to make sound decisions. You need to know when you don't have enough data to make sound decisions. You have to know how to ask questions. It's about being curious. It's about trying to understand what that data is trying to tell you, not what you're trying to force the data to tell you.--Brandeis Marshall, Associate Professor of Computer Science at Spelman College
I especially like the consideration of data science as five lanes. Data collection and cleaning is the first lane, followed by storage and management, analysis, visualization, and storytelling. Brandeis reminds us that even if you manage the first four lanes--if the story doesn't make sense--it all falls flat.
Depending on your data origin story you may have different experiences along the way, but I began as a medical writer. I think of medical writing and data integration as having an explanatory role. A client would have a complex clinical study or complicated research question requiring an explanation. Here is the data--please tell the "so what" story that will make an audience care.

Now, as I lean closer to applied data science in my educational and professional experience, I am more often than not assuming an exploratory role. Ground zero is now formulating a question and discovering whether we have the data to answer it.

Simple statistics can tell you the shape of your data, variance, correlations, clusters--and provide a visual of where to go next.
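For instance, a few lines of pandas can surface that first look (a minimal sketch; the file name is hypothetical):

```python
import pandas as pd

# Hypothetical file name -- swap in your own data set
df = pd.read_csv("clinical_outcomes.csv")

print(df.shape)                    # rows and columns: the literal shape
print(df.describe())               # means, spread, quartiles for numeric columns
print(df.var(numeric_only=True))   # variance, column by column
print(df.corr(numeric_only=True))  # pairwise correlations hint at structure

# A quick scatter matrix often reveals clusters worth chasing
pd.plotting.scatter_matrix(df.select_dtypes("number"), figsize=(8, 8))
```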
Inspiration is cheap, but rigor is expensive. And if you are not willing to pay up, don't expect that there is some magical formula that's going to give it to you.--Cassie Kozyrkov
Where do you get the skills? Workshops are expensive and time-consuming. I was fortunate to be able to step away to take refresher courses in data analytics, but that isn't for everyone.

The conversation on DataFramed actually estimated tens of thousands of dollars and two-week bootcamps running from 9 to 5 every day. Not exactly what I would consider true "access" for those of us wanting to skill up.
We are developing a series of data literacy workshops: simply a few hours of instruction on a variety of topics--beginning with needs assessments. It may sound specific, but what is a needs assessment if not a process for identifying a question?

How do you formulate questions? How do you identify proper data sets, then access, collect, manipulate, use, and operationalize them?

What are the outcomes you hope to measure? How should they be measured? Can they be measured accurately? What are the best types of survey questions? How do I analyze the data?

Register at the link Continuing Medical Education: change isn't necessary, survival is not mandatory.

Folks who have already registered have received instructions on downloading open-source software, a free coupon for background resources, and a few introductory sample sets to manipulate before the April 11th workshop date.

Stay tuned for archived workshops in data governance, Python for healthcare, R programming for journalists, and a wide variety of topics requested by you...



Public Use Microdata Samples (PUMS) of US Slave Population

2/3/2019

 

There are many data sources freely available once you achieve a certain level of data literacy and graphicacy. IPUMS is a curated collection of census and survey data critical for comparative research of individuals, families, or communities.

One of the advantages of IPUMS is the weighted data provided. For example, in a given data sample, each person in a "flat" or unweighted file represents 100 people.

In the weighted files, individuals are no longer counted uniformly as 100 people each; instead, every record carries its actual representation in the general population.
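A minimal sketch of the difference, assuming an IPUMS-style extract where the person weight lives in a column named PERWT (the usual convention in IPUMS USA files):

```python
import pandas as pd

# Toy extract: two records, each weighted by how many people
# in the general population that record represents
df = pd.DataFrame({
    "income": [30_000, 90_000],
    "PERWT":  [150, 50],   # person weights from the weighted file
})

unweighted_mean = df["income"].mean()  # treats each row as one person: 60,000
weighted_mean = (df["income"] * df["PERWT"]).sum() / df["PERWT"].sum()  # 45,000
print(unweighted_mean, weighted_mean)
```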

The Tableau story below is being created to help guide an audience through data exploration using visualization. It is a rough depiction of how visualizing data can be instrumental in generating hypotheses.

In honor of Black History Month, I visualized data from the 1850s and 1860s regarding the US slave population. Once the presentation is live I will be happy to post it--but for now you can view the story as a stand-alone collection of visualizations meant to suggest opportunities for additional exploration.

Scroll across captions to advance images...

Power to the pupil...

1/15/2019

 
The post-JP Morgan Healthcare Conference buzz is changing. Many are asking if the juice is worth the squeeze. Everything costs more leading up to and including the week the healthcare conference is in town. Mind you, this is layered onto the already steep cost of anything in San Francisco.

If you want to sit down and have a coffee and a conversation anywhere within a feasible radius of the epicenter of the conference--there are now table fees. Everyone wants a piece of the proverbial pie. Even if the biotech world isn't on their radar--there is money to be had and, gosh darn it, they want some of it.

This hit my inbox this morning and also seems a little tone-deaf. The summary statement is the following:

$14.6B of venture funding pumped through digital health in 2018, making it the most-funded year since we started tracking the market. But while growth has become the norm, trends suggest that this is just the first inning of a very long game.
Download the full report if you are interested, but a quick glance points to the revenue streams made possible by funding digital "health".

Douglas Rushkoff suggests that the pivot from connecting people on social media toward getting people out of the way is intentional and a bellwether for a different economy.

Look no further than the role of technological advances in diagnostics and screenings. We create mass screening initiatives, and if anyone raises the unsustainable costs and potential harms? Well, you can hear the sharpening of pitchforks while the dwindling few grow hoarse whispering, "What about Bayesian priors?"
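Those priors matter. A back-of-the-envelope Bayes calculation--with illustrative numbers, not figures from any actual screening program--shows why mass screening for a low-prevalence disease floods the system with false positives:

```python
# Illustrative numbers only -- not from any specific screening program
prevalence  = 0.001   # 1 in 1,000 people actually have the disease (the prior)
sensitivity = 0.90    # P(test positive | disease)
specificity = 0.95    # P(test negative | no disease)

# Total probability of a positive test: true positives + false positives
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: probability of disease given a positive test
ppv = sensitivity * prevalence / p_positive
print(f"P(disease | positive test) = {ppv:.1%}")  # about 1.8%
```

Even with a seemingly accurate test, roughly 98 of every 100 positives here are false alarms--the prior does the heavy lifting.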

What more efficient way to remove the individual than to hype machine learning? It reminds me of a recent article in Wired magazine, "The Exaggerated Promise of So-Called Unbiased Data Mining."
Nobel laureate Richard Feynman once asked his Caltech students to calculate the probability that, if he walked outside the classroom, the first car in the parking lot would have a specific license plate, say 6ZNA74. Assuming every number and letter are equally likely and determined independently, the students estimated the probability to be less than 1 in 17 million.

When the students finished their calculations, Feynman revealed that the correct probability was 1: He had seen this license plate on his way into class. Something extremely unlikely is not unlikely at all if it has already happened.


The Feynman trap—ransacking data for patterns without any preconceived idea of what one is looking for—is the Achilles heel of studies based on data mining. Finding something unusual or surprising after it has already occurred is neither unusual nor surprising. Patterns are sure to be found, and are likely to be misleading, absurd, or worse.
The cacophony regarding technology and digital transformation doesn't make any sense to me. Every single advance is pushing past the human elements. We create substitutes for understanding, trust, connection, awareness--but are we considering the cost? Do you believe the patient is the primary focus of digital start-ups, or is it perhaps their data? What else can explain the large-scale investments?
What people would or wouldn't pay for with money, we would now pay for with personal data. But something larger had also changed. The platforms themselves were no longer in the business of delivering people to one another; they were in the business of delivering people to marketers. Humans were no longer the customers of social media. We were the product.--Team Human by Douglas Rushkoff
I can't help thinking that if we were investing in social change instead of digital economics, we would see investment levels orders of magnitude lower. Do you think investor shareholders are looking for marginal profits or growth-based capitalism? Think about the valuation of Twitter.

Forbes ranked Twitter #21 at $24.7 billion as of June 2018. Am I the crazy person in the room, or does that seem like a lot of money for a social media platform built on connections? I think we both know that is not the purpose of the platform. We are the product. The price is for our attention. Power to our pupils.
It's not the meme that matters, but the culture's ability to muster an effective immune response against it.--Team Human by Douglas Rushkoff
What happens now? We either guide culture toward returning our autonomy and privacy by making different digital choices or we shrug our shoulders and pretend this is what innovation looks like.
Under the pretense of solving problems and making people's lives easier, most of our technological innovations just get people out of sight, or out of the way. This is the true legacy of the Industrial Age.
I get it. I don't have the pressure of working for a pharmaceutical company, industry stakeholder, or agency driven by profits and growth. But we all need time for contemplative thought. If corporations are externalizing costs, who is paying the price?

I would argue it is the consumer, no longer offered the ability to "smooth consumption" over time, who pays for a prescription while unwittingly reimbursing the "innovators" for the price of "R&D" at the point of sale--in the now.

We are no longer being courted for our attention with thoughtful and beneficial information. It is a rush to the bottom. I would also argue that we have seen it coming. And decided to blink instead...

When data speaks, who is listening?

12/4/2018

 
Perhaps a bit early for resolutions, but I need to claim this one publicly so I am held accountable. In the past, I have viewed data visualizations as sketches for launching bigger and more granular data projects. Nothing wrong with this approach, but I am now thinking the crumpled pages and archived visualizations--with a little more refinement--might be worthy of attention on their own.

My favorite leader in the data viz revolution has been The Financial Times. I will admit it here, dear reader. I subscribe solely for the charts and data.
I tend to work as a design thinker. I need something visual before I can articulate or refine a problem. Before I see or look for data (depending on my role) I sketch a quick graphic to seed an idea of what might be the best option for visualizing the data and communicating little arguments.

A visual drawing also helps communicate to teams the type of data needed to address the question posed, and the feasibility of the approach. For example, without geospatial data, mapping patterns according to geography is not possible (although limited existing geocodes can be enhanced with a little artful integration of shapefiles, for example).
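As a sketch of that artful integration (file and column names are hypothetical), geopandas can join a flat table of county-level outcomes to a shapefile so the data becomes mappable:

```python
import geopandas as gpd
import pandas as pd

# Hypothetical inputs: county boundaries plus outcomes keyed by FIPS code
counties = gpd.read_file("county_boundaries.shp")
outcomes = pd.read_csv("readmission_rates.csv")  # columns: FIPS, rate

# The join turns limited geocodes (FIPS) into mappable geometry
mapped = counties.merge(outcomes, on="FIPS")
mapped.plot(column="rate", legend=True)  # a quick choropleth
```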

Here is a snippet from a book on my shelf, Design Thinking: Understanding How Designers Think and Work. The bolded words in the original text read "designer(s)"; I swapped them for "data scientist(s)".

It works.
"Experienced data scientists know that it is possible to go on almost forever gathering information and data about a data problem, but that they have to move on to generating solution proposals, which in themselves begin to indicate what is relevant information.

In a data project it is often not at all clear what the 'problem' is; it may have been loosely defined by the client, many constraints and criteria may be un-defined, and everyone involved in the project may know that goals may be re-defined during the project.

In data, 'problems' are often defined only in relation to ideas for their 'solution', and data scientists do not typically proceed by first attempting to define their problems rigorously. Of course, this can mean that important information is overlooked, or is discovered only very late in the data process, causing disruption and delay."
We need to pause and define the problems--rigorously.

I like how sketches can draw you in even in the absence of a shared language. I included the graphic below because I got lost in its simplicity and ability to communicate ideas and situations.
Andy Kriebel wonderfully and artfully recreated the Visual Vocabulary as an interactive.

Scroll below to interact with Visual Vocabulary!--Andy Kriebel

Attention, patience, and time are well spent when launching data projects. In prior years, many of us worked in silos while our clients kept their distance.

Let's be honest, nobody on the client side particularly cared how the sausage was made. Now it is all about collaboration.

All the technical mumbo jumbo hangs silently in the air, and you better be able to rein it in and communicate clearly--whether it's statistical modeling, null hypothesis testing, or helping to refine the data question.

We are listening...

persevere, prevaricate, or pivot

11/1/2018

 
How should we make decisions in our personal and professional lives? What are winning strategies and what are the unique perspectives flying relatively under the radar?
Maybe what I do for a living is unique--or maybe it isn't. But I will argue it is certainly dynamic. In the last several weeks I attended the International Conference on Advances in Interdisciplinary Statistics and Combinatorics; the National Academy of Medicine Annual Meeting--Cancers: Can We Beat the Odds; Quality Talks; AWS Transformation Day; and a refresher workshop on the R programming language.
Barry D. Nussbaum, former president of the American Statistical Association and Chief Statistician of the Environmental Protection Agency, presented one of the most relevant and interdisciplinary keynotes--It's Not What We Said, It's Not What They Heard, It's What They Say They Heard.
The main objective of the conference is to promote interdisciplinary research involving statistical techniques. These techniques are becoming increasingly important in all fields of scientific discovery.

A unique feature of the proposed conference is that we plan to bring together nationally and internationally recognized researchers from many fields including anthropology, biology, economics, education, environmental science, information systems, insurance, mathematics, medicine, psychology, and public health.--International Conference on Advances in Interdisciplinary Statistics and Combinatorics 2018
A quick review of Simpson's paradox reminds us what it takes to translate clinical trial data into the real-world application of personalized medicine. Barry Nussbaum cautions--we need to be clear communicators of the little arguments in the graphics.
This isn’t your ordinary health care discussion. Quality Talks is a series of stirring, succinct talks by current and emerging health care leaders with ideas about how we can collaboratively improve American health care.

In addition to the dynamic and thought-provoking speakers, the event features interactive dialogue among all attendees about advancing the health care system and improving patient care.--Quality Talks 2018
I absolutely enjoyed Quality Talks: quick and clean TED Talk-style presentations followed by speaker hubs for face-to-face discussions with presenters. This should be the model for important discussions in healthcare. One of my favorite speakers was Bon Ku, MD--looking at the system not with handwringing or criticism, but with an eye toward innovating and aligning new edges in healthcare.

Here is a snippet of a talk he gave--although not from Quality Talks--where you can see how his novel approach to medical education needs to go viral. I am patiently waiting for Continuing Medical Education (CME) to innovate, but so far--status quo prevails.
If you think about big data--think bigger. John Halamka, MD, is the Chief Information Officer at Beth Israel Deaconess Medical Center. In his Quality Talk, Will Health IT Finally Be Driven by Demand, he shared that he is storing 11 petabytes of data on AWS. To grasp the magnitude, think of a peta-second. As we walked back into the room following a conference break, Halamka commented that a peta-second ago, a brontosaurus would have walked the space. Mind-boggling, no?
When AWS Transformation Day comes to your town, I suggest you attend if possible. If you succeed in building big data communities for your clients, there is a temptation to blur the edges of your expertise and get lost in the infrastructure. In my own experience, I believe you must be fluent in the process, but what you might actually need at the table is an experienced data engineer.

The lexicon appears to lump everything into broad categories such as "data scientist"--but that means nothing. You need someone to develop an architecture that supports analysis and processing, as well as oversight to monitor the systems.

The slide below shares the output of a machine-learning model generating text from an image. Absolutely hysterical, but it also shows the limits of what we can glean from limited data.
I work in a mobile environment because that is where the data lives. I wanted to share a few insights from the last round of travel. I rely on the information when building presentations, projects, or even simply engaging across social media. You need a vast toolbox to curate from the edges. The edges have the best view...

We can either remain with the status quo, distort reality with limited literacy around our data, or pivot toward doing better. Follow along for a series of data workshops to help you begin your journey...

"Full of sound and fury, signifying nothing"--Macbeth

9/30/2018

 
I think it was Yuval Noah Harari, author of Sapiens: A Brief History of Humankind and Homo Deus: A Brief History of Tomorrow, who proclaimed during an interview for his new book 21 Lessons for the 21st Century that it isn't really Big Data we need. It is clarity. In my mind, this speaks to how insights are gleaned from the right data--not just sheer volume.

The loudest voices are often the most blustery and ill-informed. Think of all the big buzzwords arriving shiny and new--we work them into our presentations, write them into our narratives, and search for them in the fire-hose of media headlines.

Not a fan of generalities, but here is one now: I ALWAYS query survey writers when they include vague terminology in their draft instruments. We can't measure what isn't defined. And on the off chance we attempt to measure it anyway--the analysis is doomed. I am the same way when clients sit around a room pontificating on "value", "innovation", or the "patient" as the latest blockbuster in healthcare.

We need to do a better job of detecting the signal from the noise.

Here is the regression model describing signal and noise: Y = f(X) + ε. On the right side of the equation, f(X) represents the signal, and ε is the noise. Anything not captured by the "signal" is therefore described as "noise"--but the noise could also contain signals of interest, or signals that should be of interest.
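A minimal simulation makes the split concrete: construct data as signal plus noise, fit the model, and whatever the fit fails to capture lands in the residuals.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, x.size)  # signal + noise, by construction

b1, b0 = np.polyfit(x, y, 1)     # estimated "signal": f(X) = b0 + b1*X
residuals = y - (b0 + b1 * x)    # everything left over gets labeled "noise"

# If structure remains in the residuals, it is a signal we failed to model
print(residuals.mean(), residuals.std())
```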
"If I am sick and given a choice of treatments, the central question to me is which treatment has the best chance to cure me, not some randomly selected ‘representative’ person."--Xiao-Li Meng, Department of Statistics, Harvard University, Cambridge, MA

How do we decide which signals may or may not be of potential interest? I know, I know--it is easier to claim to be on the side of the patient and far less interesting to look under the hood. But when I see aggregated data telling me one thing, outcomes telling me another, and a large percentage of stakeholders simply looking the other way--I become suspicious.
Think about the data from immuno-oncology clinical trials. We compare outcomes between groups without elaborating on a potentially undiscovered confounding or third variable. There is actually a term for this--Simpson's paradox.
For this paradox to occur, two conditions must be present: (a) an ignored or overlooked confounding variable that has a strong effect on the outcome variable; and (b) a disproportionate distribution of the confounding variable among the groups being compared (Hintzman, 1980; Hsu, 1989). The effect size of the confounding variable has to be strong enough to reverse the zero-order association between the independent and dependent variables (Cornfield et al., 1959), and the imbalance in the groups on the confounding variable has to be large (Hsu, 1989).

Key to the occurrence of this paradox is the combination of these two conditions, because unequal sample sizes alone generally are not a problem as long as they are not coupled with other internal validity issues, such as power. (For further reading on Simpson's paradox, see Neutel, 1997 and Rücker & Schumacher, 2008.)--Simpson's Paradox and Experimental Research
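A toy illustration of the reversal (the classic kidney-stone numbers, not oncology data): treatment A wins within each stratum, yet B looks better in the aggregate because the confounder--stone size--is distributed disproportionately across the arms.

```python
import pandas as pd

# Success counts by treatment and severity stratum (classic textbook numbers)
df = pd.DataFrame({
    "treatment": ["A", "A", "B", "B"],
    "stratum":   ["small", "large", "small", "large"],
    "successes": [81, 192, 234, 55],
    "patients":  [87, 263, 270, 80],
})

print(df.assign(rate=df.successes / df.patients))  # A beats B in BOTH strata

overall = df.groupby("treatment")[["successes", "patients"]].sum()
print(overall.successes / overall.patients)  # yet B "wins" overall: 83% vs 78%
```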
The complexity of innate and adaptive immunity is vast. In this era of personalized medicine we need to appreciate the infinite scope of "what we don't know that we don't know".

Heralding phase II results as ready for prime time conflicts with the need to avoid aggregation bias. Aggregation bias, also referred to as ecological bias, occurs when crudely or partially adjusted associations fail to appropriately measure the effect of an exposure or treatment. This typically happens because of differences in other risk variables between the control and treatment arms.

How do we decide how to interpret complex clinical findings? I suggest improving our data literacy. And reading everything by Xiao-Li Meng.


The “personalized situation” also highlights another aspect that our current teaching does not emphasize enough. If you really had to face the unfortunate I-need-treatment-now scenario, I am sure your mind would not be (merely) on whether the methods you used are unbiased or consistent. Rather, the type of questions you may/should be concerned with are

(1) “Would I reach a different conclusion if I use another analysis method?” or

(2) “Have I really done the best given my data and resource constraints?” or

(3) “Would my conclusion change if I were given all the original data?”

From the article, A trio of inference problems that could win you a Nobel Prize in statistics (if you help fund it).

The full quote from Macbeth is a perennial favorite among many,

Out, out, brief candle!
Life’s but a walking shadow, a poor player
That struts and frets his hour upon the stage
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing.

A willingness to engage with ideas...

9/17/2018

 
As human beings, we live in stories. It doesn’t matter how quantitative you are, we’re all influenced by stories. They become like statistics in our mind. So if you report the statistics without the story, you don’t get nearly the level of interest or emotion or willingness to engage with the ideas.--Rebecca Goldin
I enjoy statistical conferences. No, I am not joking. One of the first I attended was the National Conference on Health Statistics at the National Institutes of Health (NIH) years ago. There was not a lot of storytelling coming from the podium--it was passed over in favor of highly technical, esoteric discussions about models, statistical approaches, and research-specific methodologies.

But I did learn a lot about data sources, the Integrated Public Use Microdata Series (IPUMS), a wide variety of resources for learning R, and how to work heteroscedasticity into conversations about data.

Fast forward a few years and I noticed something interesting. Apparently there really aren't a lot of industry-adjacent attendees at these conferences--mostly statisticians and biostatisticians in the audience, but not many data-centric professionals like me. But here is the rub: if healthcare industry professionals aren't engaged with your sessions, whom are they for? So I ask the questions, and I learn more than I ever gleaned from a statistics course.

STATS Sense About Stats is a resource you don't want to forget. I use their services when I work out a complex model and want to make sure it makes sense. Not to be confused with STAT--a news source that throws all of its meaningful reporting behind a paywall--STATS assists at no charge.

Statistics is for all of us--otherwise what is the point? 

If you are thinking it doesn't matter to your work--think again. We make a lot of assumptions about p-values and what they mean not only in our own research but in the clinical research we consume to make decisions at the point of care. What if those are the wrong assumptions?
"In logical terms, the P value tests all the assumptions about how the data were generated (the entire model), not just the targeted hypothesis it is supposed to test (such as a null hypothesis). Furthermore, these assumptions include far more than what are traditionally presented as modeling or probability assumptions—they include assumptions about the conduct of the analysis, for example that intermediate analysis results were not used to determine which analyses would be presented."--Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations.

Data literacy is important. It is even more important today if you communicate your findings, thoughts, or ideas. Data rarely tell us absolute findings. We need to discover probabilities and communicate uncertainty around the answers we seek. For example, if the number needed to treat (NNT) is 100--what happens to the other 99? I am reminded of The Princess Bride and often have to repeat, "I do not think it means what you think it means."
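The arithmetic behind that question is worth spelling out--NNT is simply the reciprocal of the absolute risk reduction (the event rates below are illustrative, not from any trial):

```python
# Illustrative event rates, not from any published trial
control_event_rate = 0.05  # 5% of untreated patients have the event
treated_event_rate = 0.04  # 4% of treated patients have the event

arr = control_event_rate - treated_event_rate  # absolute risk reduction = 0.01
nnt = 1 / arr                                  # = 100

# Treat 100 people and one of them avoids the event because of the drug;
# the "other 99" either were never going to have it or have it anyway.
print(round(nnt))
```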

Here is what I learned in the last few weeks alone:

  • Charts are arguments
  • Maps represent territory, not people
  • Ecological fallacy--findings may not apply if aggregated at a different level
  • Don't simplify. Clarify
  • Country data can't be extrapolated to the individual level
  • Simpson's paradox--problems result from combining data from different groups
  • Charts are tools--they extend our brain

Like, you might be interested in knowing whether taking hormones is helpful or harmful to women who are postmenopausal. So you start out with a question that’s really well-defined: Does it help or hurt? But you can’t necessarily answer that question. What you can answer is the question of whether women who take hormones whom you enroll in your study — those specific women — have an increase or decrease in, say, heart disease rates or breast cancer rates or stroke rates compared to a control group or to the general population. But that may not answer your initial question, which is: “Is that going to be the case for me? Or people like me? Or the population as a whole?”--Rebecca Goldin

A data career inspired...but not planned

9/11/2018

 
I was recently asked a question so simple in the asking, but a little more complex in the response.

What is a data model?

The reason the answer isn't so simple is that I am usually consulting downstream from the data modeling. Not an enviable position, but in most cases the importance of agreeing on which metrics to include, what data to collect, and a cohesive set of data definitions is overlooked--until it is too late. And then--my phone rings.
A database is only as good as its model. I don't do the actual modeling, but I prefer to understand the architecture and limits of the data I will be asked to analyze. Think of interoperability, but instead of being external to your organization or data practice, it also describes pulling in data from multiple data environments. The big challenge on my end is locating disparate data and trying to determine whether they are indeed measuring the same thing. Most of my familiarity is with top-down models, but at this point I would have to admit--I have seen it all.
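If I had to make the answer concrete, a toy sketch might look like the schema below: the model is the agreed set of entities, definitions, and relationships, fixed before any data lands. Table and column names here are hypothetical.

```python
import sqlite3

# The data model: entities, definitions, and relationships agreed up front
schema = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    birth_year INTEGER NOT NULL  -- agreed definition: year only, for privacy
);
CREATE TABLE encounter (
    encounter_id INTEGER PRIMARY KEY,
    patient_id   INTEGER NOT NULL REFERENCES patient(patient_id),
    visit_date   TEXT NOT NULL,  -- one agreed format: ISO 8601
    a1c_percent  REAL            -- the metric everyone signed off on
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)  # every downstream analysis inherits these choices
```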
My colleagues in the data world likely share my frustration regarding the focus on debates between data scientists, statisticians, data analysts, programmers, and any other group they can successfully work into a froth. I gravitate toward data scientist but mainly because I actually tend to spend my day similarly to the HBR article description below...
...working data scientists make their daily bread and butter through data collection and data cleaning; building dashboards and reports; data visualization; statistical inference; communicating results to key stakeholders; and convincing decision makers of their results.--What Data Scientists Really Do, According to 35 Data Scientists--Hugo Bowne-Anderson
The transition and evolution are real. I don't know about you, but I am constantly updating my skills and modern methods of analysis. How many of us can afford loyalty to SPSS and SAS if the best solution is R or Python? Because I access a wide variety of data registries, I need to know how to work with SAS and SPSS. And if a government database allows downloads of its raw data only in XML, well, I had better know how to harmonize these data resources for visualization in Tableau or Flourish.
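As a sketch of that harmonization (the element names are hypothetical), the Python standard library is enough to flatten an XML download into rows Tableau or Flourish can ingest:

```python
import csv
import xml.etree.ElementTree as ET

# Hypothetical layout: <records><record><state>..</state><rate>..</rate></record></records>
tree = ET.parse("registry_download.xml")

with open("registry_flat.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["state", "rate"])  # header row for the visualization tool
    for record in tree.getroot().iter("record"):
        writer.writerow([record.findtext("state"), record.findtext("rate")])
```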
One result of this rapid change is that the vast majority of my guests tell us that the key skills for data scientists are not the abilities to build and use deep-learning infrastructures. Instead they are the abilities to learn on the fly and to communicate well in order to answer business questions, explaining complex results to nontechnical stakeholders. Aspiring data scientists, then, should focus less on techniques than on questions. New techniques come and go, but critical thinking and quantitative, domain-specific skills will remain in demand.--What Data Scientists Really Do, According to 35 Data Scientists--Hugo Bowne-Anderson
I love Elena's solution below. If the industry can agree on what we specifically mean when we say we need a data scientist, we can align objectives with workable tasks and outcomes. Think about it--do we say we need engineering professionals without specificity and appreciation for the granularity of skills at the different levels of engagement? Mechanical? Chemical? Civil? Electrical? Petroleum? All have a purpose that a generalized definition obscures.

Perhaps there are mythical hordes of talent who know eight different programming languages; are well-versed in statistical and predictive modeling, machine learning, clustering and classification, Python, Scala, Java, SQL, Hive, Spark, etc.; are subject matter experts; can source data; and can communicate all of the processes and deliverables clearly to all stakeholders.

Good luck finding them--I don't have time for all of that. Too much work to do...

Elena Grewal, Head of Data Science at Airbnb...

I am on the Analytics track most days but need to consume massive amounts of Inference or I would be lost. We need to evolve along with the deluge of data threatening to obscure our ability to operationalize the promises of AI and IoT.

Or we are just smacking ourselves in the head with fish. Listen to the podcast below from The Economist Radio, The Secret History of the Future: The Body Electric.

We’ve used electricity to treat our brains for thousands of years, from placing electric fish on our heads to cure migraines to using electroconvulsive therapy to alleviate depression. But over time, our focus has shifted from restoring health to augmenting our abilities.

Think of data like a vitamin--essential but often ignored. 

"Medicine is a science of uncertainty and an art of probability"

8/28/2018

 
At this stage of the game I have one son returning to high school and one out experiencing a gap year working for AmeriCorps as a VISTA in the wilds of the Upper East Side of Manhattan. Both spent formative years being educated in a Montessori curriculum and have rock-solid educational skills. But neither of them possesses a love of the maths.

I have my theories about math in general. We are taught so much about certainties in algebra, trigonometry, and geometry that we are robbed of the "what if" fun stuff that we use every day. I have always argued that if students were taught applications of mathematical concepts like risk and probabilities, we would all be better for it.
Discussing data literacy in front of audiences requires precise definitions. For example, risk and uncertainty are not the same thing. When we discuss risk we are informed of the set of alternate possible outcomes--and we have probability theory and statistics to help us along.

But uncertainty reflects a larger scale where we can't possibly know every potential alternative or consequence; we have to make estimates and rely on heuristics.

In a world of risk (small world), all relevant alternatives, their probabilities, and their consequences are known for sure and the future is certain. In contrast, in a world of uncertainty (large world) part of the information is unknown or has to be estimated from small samples, and surprises can happen.

The second distinction we introduced is between what decisions people make (the outcome) and how they make them (the process). Answering the first question leads to as-if models; answering both questions leads to process models. We argue that the two distinctions are correlated: As-if models tend to match small world studies, whereas process models tend to match large world studies.--Volz and Gigerenzer, Cognitive Processes in Decisions Under Risk are not the Same as in Decisions Under Uncertainty

Risk literacy speaks to the scarcity of opportunities to learn about uncertainty within our own disciplines. We are told to participate in mass disease screening, rely on phase II clinical research despite its limited population size, and accept the results of clinical articles regardless of intentional obfuscation.

Striving for improved data literacy continues to be an important goal within our healthcare ecosystem. How are you tackling the problem?

Medicine is a science of uncertainty and an art of probability--William Osler, Canadian physician and one of the four founding professors of Johns Hopkins Hospital

Assessing complexity in our data agile world...

8/21/2018

 
A public life in any field solicits a fair share of requests for mentorship, guidance, and employment. Not to be cheeky but I rarely receive requests worthy of a response.

Before you think I am a monster, read below:

I would really appreciate if you can give me leads or guide me! 

Heyy... i have experiance of 2 plus year as a tableau developer

Kindly review my profile and let me know if you have any suitable opportunity for me in medical writing. Looking forward to hear from you.


These don't work. Here is one that did--not because it flattered but because it showed a little skin in the game...

  • I read a few of your posts via Medium, based on just a few posts, I enjoy your unique, intellegent & fresh takes. In short, it's not the same conversations we hear in healthcare for the past decades. I saw that you're available for for consulting & advising, are you open for a quick chat next week?

You must tell me...

When seeking connections or guidance remember to first be of service. Write a few articles on a topic that reflects your expertise. Polish it up nice and shiny and share it with someone for free. Show that you aren't afraid to roll up your sleeves and contribute. You don't want to be yelling at a fireplace demanding warmth before at least throwing a log into the hearth...life doesn't work that way.
I am also fairly confident that nobody wants to hear how hard it really is. Successful colleagues show up every day. They may not be the cleverest or best writers or communicators but you know what they actually are? Deliberate and consistent. They will get there first and outlast you if you bring anything less than your "A" game to the table.

Did you know that only around 10 percent of people are early risers?

My latest book will be driven by how a number or statistic comes to the surface and generates a story. Listening to Seth Godin's podcast has empowered me to write for the smallest viable audience. You can create something and not worry whether it is an instant blockbuster--we already know how to appeal to the masses: write for the average. Instead, seek out something special or unique for your few true believers.

​But maybe that isn't where you want to be. How can you find a creative spark?

Hint: it usually starts with a podcast...
When on terra firma, I rely on podcasts--all kinds. Blending different industries and building new skills are paramount. Words of wisdom from Tony Hawk: he learned new skills to build a broad base of technique, whether he was interested in those skills or not.

Align two industries or ideas along the edge, and that is where innovation lives. I have a broad library of audio to listen to during long trail runs or drives to DC (did I mention I was just accepted to the National Press Club?). A bit of design, medicine, and economics mixed liberally with technology and health, machine learning and culture leaves me puzzled when asked about writer's block.
Because I work with data in an ecosystem of architecture, governance, analytics, and curation, I am accustomed to cross-pollinating a bit. When I see agile systems benefiting software development, I think--"how can this apply to a current or potential client's data challenges?"
I will share an idea I adapted from a podcast: I started using the Fibonacci sequence in data planning meetings. The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones--for example, 1, 1, 2, 3, 5, 8, 13... on and on.

Its utility in data projects is that each member of the team uses a card to represent the perceived complexity of a task. In software development this helps with story pointing or time estimates for work, but for me--it allows me to level-set expectations and gauge whether the team is being realistic, or whether we need to spend a little more granular time discussing aspects of the project or the overall strategy.
Picture
The beauty of the Fibonacci numbers is that we aren't dealing with negligible increments of detail. A 13 is discernible from an 89 when trying to measure complexity in a project--and that is why we may need six weeks instead of the pie-in-the-sky three-week timeline.
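A few lines of Python generate the card deck and show why the widening gaps help (a minimal sketch):

```python
def fib_scale(n):
    """First n Fibonacci numbers, used here as complexity 'cards'."""
    a, b = 1, 1
    scale = []
    for _ in range(n):
        scale.append(a)
        a, b = b, a + b
    return scale

# Gaps widen as the numbers grow, so the team can't quibble over
# negligible increments: a 13 is unmistakably different from an 89.
print(fib_scale(11))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```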

How did you do it?

Stay tuned for the data resources I will be using to write a memoir driven by data. Think about Harper's Index and how the ember of an idea can be scaled to an actual narrative. I think once it gets rolling I can play the "cake" scrum card. Easy as a piece of cake...