The COVID-19 Butterfly Effect
Without question, COVID-19 has had massive effects on data. There are obvious changes that we see reported in the news: hospital revenues are down, commercial air travel is down, and so on. But each of those changes can create longer-term changes in the data that analysts and data scientists will rely on in the future. In this blog, we will review the COVID-19 “Butterfly Effect” and how we can “cure” our COVID-19 data woes.
The COVID-19 Butterfly Effect
Almost daily we see stories about the effect of coronavirus on hospitals. ERs and ICUs are full of patients. But with the increase in coronavirus cases, we also see a decrease in the volumes of other departments. From cancellations of elective surgeries to patients’ fears around going to the office, these changes in volume can have long-standing effects. Large organizations depend on volume and revenue forecasting for everything from staffing to budgeting. Additionally, from a clinical perspective, we could see additional need for hospitalization in the future due to delayed preventive or maintenance care. These delays in care could also cause changes in how we measure disease progression and mortality.
Another hard-hit industry has been commercial air travel. Beginning with shutdowns and continuing with most consumers avoiding unnecessary travel, airlines are operating fewer flights. This change in flight volume data will certainly affect forecasting for travel and airlines. But the long-range effects are far greater. Commercial airlines are one of the major sources of upper-atmosphere data, with onboard sensors providing near real-time updates. This data is used in forecasting weather, measuring air quality, and identifying hurricane paths, and it is being submitted at about a quarter of the normal rate due to the pandemic.
So, what’s next?
The cure for COVID-19 data issues is highly dependent on how COVID anomalies show up in the data. Some data saw a quick spike followed by a return to normal. Other data, like Zoom stock, saw a spike that created a “new normal”. Travel industries saw massive downturns that will be followed by a slow return to normal.
Consider the case of a hand sanitizer distributor: we wouldn’t expect to see another mad dash for hand sanitizer. As with other outlying single events in a time series, we may be able to remove the outliers by either a) ignoring the data or b) imputing non-COVID values in the time series. Imputation is a great tool for handling missing data, but it requires precision to account for as much underlying variance as possible. This can be handled with business logic, when most of the trends are known, or with machine learning, when the dataset is large and has many unknowns.
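As a rough illustration of option b), here is a minimal sketch of the "business logic" flavor of imputation: each month in the COVID window is replaced with the average of the same calendar month in prior years. The function name `impute_window`, the `history_years` parameter, and the sales figures are all made up for illustration, and the same-month average is just one simple assumption about what "non-COVID values" could look like.

```python
from statistics import mean

def impute_window(series, window, history_years=2):
    """Return a copy of `series` where each (year, month) in `window` is
    replaced by the mean of the same month over the prior `history_years`."""
    imputed = dict(series)  # copy so the raw data stays untouched
    for (year, month) in window:
        prior = [
            series[(year - k, month)]
            for k in range(1, history_years + 1)
            if (year - k, month) in series
        ]
        if prior:  # leave the value as-is when no history is available
            imputed[(year, month)] = mean(prior)
    return imputed

# Made-up monthly unit sales for a hypothetical hand sanitizer distributor.
sales = {
    (2018, 3): 100, (2018, 4): 110,
    (2019, 3): 120, (2019, 4): 130,
    (2020, 3): 900, (2020, 4): 700,  # pandemic spike
}
covid_window = [(2020, 3), (2020, 4)]

cleaned = impute_window(sales, covid_window)
print(cleaned[(2020, 3)], cleaned[(2020, 4)])
```

Note that a same-month average keeps seasonality but ignores any year-over-year growth, so a real series with a trend would likely need a trend adjustment or a fitted model instead.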
In the case of a cruise line, we need to account for COVID-driven variability to manage the slower recovery. This requires analysis of variables that may measure the coronavirus effect. These could include things like public perception (tracking mentions on Google or in the news), actual exposure numbers, and factors like the extent of lockdown measures. For travel specifically, a dampening effect based on days since the initial lockdown may be used to understand consumer behavior. With these factors accounted for in both the historical and present data, a model can rapidly adapt to show what the best-case and worst-case scenarios may look like.
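One way to picture the dampening effect is as a feature that starts at 1.0 on the day of the lockdown and decays toward 0 as time passes. The sketch below assumes an exponential (half-life) decay, which is only one possible shape; the function names, the 90% initial drop, and the two half-lives are hypothetical values chosen to show how best-case and worst-case scenarios could be compared.

```python
def covid_dampening(days_since_lockdown, half_life_days):
    """Share of the initial shock still in effect: 1.0 on day 0, halving
    every `half_life_days` (an assumed exponential decay)."""
    if days_since_lockdown < 0:
        return 0.0  # before the lockdown there is no COVID effect
    return 0.5 ** (days_since_lockdown / half_life_days)

def projected_bookings(baseline, initial_drop, days_since_lockdown, half_life_days):
    """Baseline demand reduced by the remaining share of the initial drop."""
    remaining = covid_dampening(days_since_lockdown, half_life_days)
    return baseline * (1.0 - initial_drop * remaining)

baseline = 1000.0    # hypothetical pre-COVID weekly bookings
initial_drop = 0.9   # hypothetical 90% drop at the start of lockdown
scenarios = {"best_case": 90, "worst_case": 360}  # hypothetical half-lives, in days

for name, half_life in scenarios.items():
    estimate = projected_bookings(baseline, initial_drop, 180, half_life)
    print(f"{name}: {estimate:.0f} bookings at day 180")
```

In practice the half-life would be estimated from data rather than picked by hand, and the dampening term would sit alongside the other variables mentioned above (public perception, exposure numbers, lockdown extent) as inputs to the forecasting model.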
From a data standpoint, we need to be better prepared for massive shifts, regardless of their cause. In a post-COVID world, one can only hope that companies take the following lessons to heart:
- Store more data. It can be cumbersome, but storing raw data is the best way to ensure preparedness for change.
- Standardize measurements. If everyone is using a slightly different definition, it’s hard to track progress. It might not be noticeable in normal times, but when the numbers get extreme, slight differences start to surface!
- Measure and mitigate early. Use analytics and data to prepare for, understand, and strategize around changes.