Not sure if this escaped your interest, but theories about the widespread transmission of COVID-19 in New York now have their own modern-day water pump. The part of the water pump is being played by subway turnstiles, and the story is a fascinating read if nothing else. The following is a working paper that reads like a conversation, which is one of the reasons I enjoyed reading it and interpreting the data. Full disclosure: please be aware that this National Bureau of Economic Research paper is not peer-reviewed and is circulated for discussion and comment only.
THE SUBWAYS SEEDED THE MASSIVE CORONAVIRUS EPIDEMIC IN NEW YORK CITY
New York City’s multitentacled subway system was a major disseminator – if not the principal transmission vehicle – of coronavirus infection during the initial takeoff of the massive epidemic that became evident throughout the city during March 2020. The near shutoff of subway ridership in Manhattan – down by over 90 percent at the end of March – correlates strongly with the substantial increase in the doubling time of new cases in this borough. Maps of subway station turnstile entries, superimposed upon zip code-level maps of reported coronavirus incidence, are strongly consistent with subway-facilitated disease propagation. Local train lines appear to have a higher propensity to transmit infection than express lines. Reciprocal seeding of infection appears to be the best explanation for the emergence of a single hotspot in Midtown West in Manhattan. Bus hubs may have served as secondary transmission routes out to the periphery of the city.
For each station, the idea is first to compute the time trends in turnstile entries and coronavirus incidence, and then to assess whether there is a relation between the two trends across different subway stations (Fredriksson and Oliviera 2019). Unfortunately, there is a serious problem with this extraordinarily popular method of doing policy analysis (Bertrand, Duflo, and Mullainathan 2004). In particular, there is likely to be significant serial correlation in the outcomes among adjacent subway stations situated along the same line.
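To make the station-level comparison concrete, here is a minimal sketch of the two steps: fit a time trend per station for entries and for incidence, then check how the two sets of trends relate across stations. The station names and all numbers are hypothetical and chosen only for illustration, and the helper functions `slope` and `pearson` are my own, not the paper's. Note that this naive version does nothing about the serial correlation among adjacent stations, which is exactly the weakness pointed out above.

```python
def slope(values):
    """Least-squares slope of values against the time index 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var_x = sum((x - x_mean) ** 2 for x in xs)
    var_y = sum((y - y_mean) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL data: station -> (weekly turnstile entries, weekly new cases).
stations = {
    "Station A": ([1000, 800, 400, 100], [5, 9, 12, 13]),
    "Station B": ([1200, 1100, 900, 700], [4, 10, 18, 28]),
    "Station C": ([900, 600, 200, 50], [6, 9, 11, 12]),
}

# Step 1: one time trend per station for each series.
entry_slopes = {name: slope(entries) for name, (entries, _) in stations.items()}
case_slopes = {name: slope(cases) for name, (_, cases) in stations.items()}

# Step 2: relate the two trends across stations.
names = sorted(stations)
r = pearson([entry_slopes[n] for n in names], [case_slopes[n] for n in names])
print(entry_slopes, case_slopes, r)
```

In this made-up example the correlation is positive: the station whose entries fell least is also the one where cases grew fastest. With real data, treating each station as an independent observation would overstate the confidence in such a result.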
Once it becomes clear that the individual subway station may not be the appropriate unit of analysis, the discussion turns to the utility of considering entire subway lines. I will summarize the static model of epidemic propagation, which is discussed in more detail in the paper. Basically, susceptible individuals are classified as S, and the infectious individuals they come into contact with are classified as I.
Incidence of new infection depends on the frequency of contact between S and I and the probability that there is transmission of infection.
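As a rough illustration of that relationship, the sketch below uses the standard mass-action form, in which expected new infections per day equal the number of susceptibles times their contact rate, times the share of contacts that are infectious, times the transmission probability. The function name, parameter names, and numbers are illustrative assumptions of mine, not values or notation from the paper.

```python
def new_infections(susceptible, infectious, contacts_per_day, p_transmission):
    """Expected new infections per day under a simple mass-action assumption.

    Hypothetical helper for illustration; not the paper's exact formulation.
    """
    population = susceptible + infectious
    if population == 0:
        return 0.0
    # Share of a susceptible person's contacts that are with infectious people.
    frac_infectious = infectious / population
    # Each susceptible makes contacts_per_day contacts; each contact with an
    # infectious person transmits with probability p_transmission.
    return susceptible * contacts_per_day * frac_infectious * p_transmission


# Made-up numbers: 990 S and 10 I riders, 20 contacts per day, 5% transmission.
baseline = new_infections(990, 10, contacts_per_day=20, p_transmission=0.05)
# Doubling the contact rate (for example, a more crowded car) doubles incidence.
crowded = new_infections(990, 10, contacts_per_day=40, p_transmission=0.05)
print(baseline, crowded)
```

The point of the toy calculation is only that incidence scales with both contact frequency and transmission probability, so anything that packs riders closer together or keeps them together longer pushes incidence up.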
The Goscé model offers a number of insights that are immediately applicable to the data from the New York City Flushing subway line. The first is that the rate of disease transmission is related to the number of trips and average number of stations per trip along the entire subway line, and not just to the number of entries at any one subway station. Second, passengers entering the subway line even at a remote, less populous station are slowing down the system, thus increasing the transit time that the S’s stay in contact with the I’s. Third, those uninfected S-passengers who cram shoulder-to-shoulder into a particular subway car are increasing train-car density and thus raising the average number of other S-passengers infected by an I-passenger who happens to be standing in the middle of the train. Fourth, local trains – like the Flushing local – are more likely to seed epidemic infections than express lines. Finally, an entire subway line, rather than the individual stations or subway cars, is the appropriate unit of analysis.
An important consideration is that reducing train service likely accelerated the spread of the virus, as commuters found themselves crammed into fewer cars for longer periods of time.
One distinguishing factor between the present study and prior work is that seasonal influenza has generally had a reproductive number R in the range of 1.2–1.4, while pandemic influenza has had an R in the range of 1.4–1.8, with the high end representing the 1918 pandemic (Biggerstaff et al. 2014). By contrast, we have estimated the R in New York City during the initial surge of infections in early March to be on the order of 3.4 (Harris 2020). An overall assessment of these research efforts may lead some scientific reviewers to conclude that cause-and-effect remains difficult to prove. Still, we doubt whether any public health practitioner would be reluctant to take action on the basis of the facts we now know.
Harris, J. E. 2020. The Coronavirus Epidemic Curve Is Already Flattening in New York City. National Bureau of Economic Research Working Paper No. 26917, April 3, 2020. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3563985