Researchers use machine learning to build COVID-19 predictions
Two assistant professors from Watson School build computer models to anticipate infections
As parts of the U.S. tentatively reopen amid the COVID-19 pandemic, the nation’s long-term health continues to depend on tracking the virus and predicting where it might surge next.
Finding the right computer models can be tricky, but two researchers at Binghamton University’s Thomas J. Watson School of Engineering and Applied Science believe they have an innovative way to solve that problem, and they are sharing their work online.
Using data collected from around the world by Johns Hopkins University, Arti Ramesh and Anand Seetharam — both assistant professors in the Department of Computer Science — have built several prediction models that take advantage of artificial intelligence. Assisting the research is PhD student Raushan Raj.
Machine learning allows the algorithms to learn and improve without being explicitly programmed. The models examine trends and patterns from the 50 countries where coronavirus infection rates are highest, including the U.S., and can often predict within a 10% margin of error what will happen for the next three days based on the data for the past 14 days.
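To make the 14-days-in, three-days-out setup concrete, here is a minimal Python sketch of how a daily case series could be cut into training samples and how a forecast could be checked against a 10% margin. It is an illustration under stated assumptions, not the researchers' code: the function names `make_windows` and `within_margin`, and the reading of "10% margin of error" as a per-day relative error, are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], history: int = 14, horizon: int = 3
) -> List[Tuple[List[float], List[float]]]:
    """Pair each run of `history` days with the `horizon` days that follow it."""
    samples = []
    last_start = len(series) - history - horizon
    for start in range(last_start + 1):
        past = list(series[start:start + history])
        future = list(series[start + history:start + history + horizon])
        samples.append((past, future))
    return samples


def within_margin(
    predicted: Sequence[float], actual: Sequence[float], margin: float = 0.10
) -> bool:
    """True if every predicted day is within `margin` (relative) of the actual count."""
    return all(abs(p - a) <= margin * abs(a) for p, a in zip(predicted, actual))


# Hypothetical cumulative case counts for 20 days.
cases = list(range(100, 120))
windows = make_windows(cases)
first_past, first_future = windows[0]
```

With 20 days of data, this yields four samples; any regression model could then be fit to map each 14-day input to its three-day target.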
“We believe that the past data encodes all of the necessary information,” Seetharam said. “These infections have spread because of measures that have been implemented or not implemented, and also because how some people have been adhering to restrictions or not. Different countries around the world have different levels of restrictions and socio-economic status.”
For their initial study, Ramesh and Seetharam fed their models global infection numbers through April 30, which allowed them to see how their predictions played out through May.
Certain anomalies can lead to difficulties. For instance, data from China was not included because of concerns about government transparency regarding COVID-19. Also, with health resources often taxed to the limit, tracking the virus’ spread sometimes wasn’t the priority.
“We have seen in many countries that they have counted the infections but not attributed it on the day they were identified,” Ramesh said. “They will add them all on one day, and suddenly there’s a shift in the data that our model is not able to predict.”
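The kind of one-day reporting backlog Ramesh describes can be spotted with a simple check before any model is trained. The Python sketch below flags days whose new-case count is far above the median of the preceding week. It is only an assumption about how such anomalies might be screened, not a step the researchers report using; the function name `flag_reporting_spikes`, the seven-day window and the factor of 3 are all hypothetical choices.

```python
from statistics import median
from typing import List, Sequence


def flag_reporting_spikes(
    daily_new: Sequence[float], window: int = 7, factor: float = 3.0
) -> List[int]:
    """Return indexes of days whose count exceeds `factor` times the
    median of the previous `window` days."""
    flagged = []
    for i in range(window, len(daily_new)):
        baseline = median(daily_new[i - window:i])
        # Skip days with a zero baseline to avoid flagging the first cases.
        if baseline > 0 and daily_new[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Hypothetical daily new-case counts with a backlog dumped on day 7.
daily = [100, 110, 105, 95, 100, 102, 98, 900, 101, 99]
spikes = flag_reporting_spikes(daily)
```

The median is used as the baseline rather than the mean so that the spike itself does not distort the comparison for the days that follow it.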
Although infection rates are declining in many parts of the U.S., they are rising in other countries, and U.S. health officials fear a second wave of COVID-19 if people tired of the lockdown fail to follow safety guidelines such as wearing face masks.
“The main utility of this study is to prepare hospitals and healthcare workers with proper equipment,” Seetharam said. “If they know that the next three days are going to see a surge and the beds at their hospitals are all filled up, they’ll need to construct temporary beds and things like that.”
As the coronavirus sweeps around the world, Ramesh and Seetharam continue to gather data so that their models can become more accurate. Other researchers or healthcare officials who want to use their models can find them posted online.
“Each data point is a day, and if it stretches longer, it will produce more interesting patterns in the data,” Ramesh said. “Then we will use more complex models, because they need more complex data patterns. Right now, those don’t exist — so we’re using simpler models, which are also easier to run and understand.”
Ramesh and Seetharam’s paper is called “Ensemble Regression Models for Short-term Prediction of Confirmed COVID-19 Cases.”
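In general terms, an ensemble regression combines the outputs of several regression models into one forecast. The Python sketch below shows one common form of that idea, averaging two very simple forecasters over a 14-day history. It is not taken from the paper: the choice of models (a last-value forecast and a least-squares trend line), the plain averaging, and the function names are all assumptions made for illustration.

```python
from statistics import mean
from typing import List, Sequence


def persistence_forecast(past: Sequence[float], horizon: int = 3) -> List[float]:
    """Repeat the most recent value for each future day."""
    return [float(past[-1])] * horizon


def linear_trend_forecast(past: Sequence[float], horizon: int = 3) -> List[float]:
    """Fit a least-squares line to the history (needs at least two days)
    and extend it `horizon` days forward."""
    n = len(past)
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(past)
    denom = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, past)) / denom
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + step) for step in range(horizon)]


def ensemble_forecast(past: Sequence[float], horizon: int = 3) -> List[float]:
    """Average the day-by-day forecasts of the individual models."""
    forecasts = [
        persistence_forecast(past, horizon),
        linear_trend_forecast(past, horizon),
    ]
    return [mean(day) for day in zip(*forecasts)]


# Hypothetical 14-day history growing by 10 cases a day.
history = [100 + 10 * day for day in range(14)]
prediction = ensemble_forecast(history)
```

Averaging tends to damp the errors of any single model, which is one reason ensembles are popular when the available history is short.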
Earlier this year, they launched a different tracking project, gathering data from Twitter to determine how Americans dealt with the early days of the COVID-19 pandemic.