Apparently I am an extroverted introvert. In social situations I am calm, friendly, and have been accused of being mildly entertaining. I thrive both literally and figuratively on public speaking and being at the podium. The problem is that I prefer to stay home or with small groups of friends.
I tell you this to give you an idea of how I have been weathering the changes of the last month or so. Delays in projects, rescheduling of talks, and serious doubts about the status of conference appearances for the remainder of the year notwithstanding, I am doing pretty well. I have long been a creature of habit and, more importantly, a remote worker. My last W-2 gigs (many years ago) were also jobs where I worked from a remote office and traveled to client locations or to the office on a quasi-quarterly basis.
Many of us have been watching the pandemic and either relying on published data visualizations or creating our own from raw data. The problem is that there are many missteps and fumbles around what the data can actually contribute to the narrative.
If you are an epidemiologist or have studied epidemiology for public health, many of the miscommunications are quite obvious. I think we could all use a better foundation in data literacy and fluency, and what better place to start than with a map from the Johns Hopkins Coronavirus Resource Center? The data is available for download on GitHub, where you will also find instructions and guidance. I recommend you read the resources that explain the terms used to describe the pandemic and the important guidelines regarding epidemiology.
When I view the map, the red creates an ominous and deadly vibe. Yes, people are dying, but we need to see context to understand the numbers; fearmongering will only get us so far. I barely noticed the green font showing the number of people recovered. If the red dots indicate only confirmed cases, the real picture is much worse. Confirmed only means the cases were validated with a test, and a test has its own biases and limitations. And we know that, in the US at least, we are limited in testing, or even in providing tests to whole populations of people in our communities.
Context is king when working with large complex datasets.
There are important considerations that need to accompany any visualization, but COVID-19 data has a time horizon that is critical to clarify. For example, when were national measures like shelter in place or self-isolation enacted (shown here by star symbols)? What happens if we are only measuring confirmed cases in areas where tests are known to be largely unavailable or limited?
I personally prefer a logarithmic scale on the y-axis to better convey exponential viral growth. There is a lot to discuss in this graphic. Did you notice that the US does not have a marker indicating a national message to shelter in place? If we observe countries that have issued national orders, how long is it before the bend in the curve is evident?
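To see why a logarithmic y-axis suits exponential growth, here is a small Python sketch. It is only an illustration under stated assumptions: the case counts are synthetic (a made-up series that starts at 100 and doubles every 3 days), not the Johns Hopkins data, and the helper names `log10_series` and `doubling_time` are my own. On a log scale, steady exponential growth rises by the same amount each day, so it plots as a straight line, and a bend in the curve shows up as a change in that slope.

```python
import math


def log10_series(counts):
    """Return log10 of each (positive) count, i.e. its position on a log axis."""
    return [math.log10(c) for c in counts]


def doubling_time(counts):
    """Estimate the doubling time in time steps from the first and last counts.

    Assumes growth is roughly exponential over the whole series.
    """
    growth_rate = math.log(counts[-1] / counts[0]) / (len(counts) - 1)
    return math.log(2) / growth_rate


# Synthetic, illustrative series only: 100 cases doubling every 3 days.
cases = [100 * 2 ** (day / 3) for day in range(10)]

logged = log10_series(cases)

# Day-to-day change on the linear scale keeps growing...
linear_steps = [b - a for a, b in zip(cases, cases[1:])]

# ...while the day-to-day change on the log scale stays constant.
log_steps = [b - a for a, b in zip(logged, logged[1:])]

print("linear steps:", [round(s, 1) for s in linear_steps])
print("log10 steps: ", [round(s, 4) for s in log_steps])
print("estimated doubling time (days):", round(doubling_time(cases), 2))
```

With real data the log steps will not be perfectly constant; what to watch for is whether they shrink after measures such as shelter in place take effect.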
Here are a few resources to help you make better visualizations...