What is Model & Scan COVID Trends (MSCT)?
MSCT models the COVID-19 epidemic and predicts the end of its most severe phase for each Italian region. It continually adjusts its estimates by incorporating daily updates into the model, and the outcomes are presented in this app.
First, select the kind of data you want to visualize.
Some data are the result of running the MSCT module (e.g. Model Score, Remaining days); the rest is a summary of data from public institutions.
How?
We calculated a derived variable called the Epidemic Indicator, defined as the number of patients in Intensive Care Units (ICU) plus the number of daily deaths. We noticed that this variable follows a regular descending trend after the peak registered on April 1st. It is easy to measure and does not suffer from differences in sampling methodology or from weekly fluctuations. It is certainly not perfect, since it can be prone to underestimation, mostly during the acute peak of the crisis, when the health system was stressed and hospitals were saturated. After that critical phase, however, the Epidemic Indicator should reflect fairly well the number of patients most severely affected by the virus.
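A minimal sketch of how the Epidemic Indicator could be computed in Python. The function name, the argument names, and the numbers are illustrative assumptions, not the actual MSCT code or real regional data.

```python
def epidemic_indicator(icu_patients, daily_deaths):
    """Return ICU patients plus daily deaths, day by day."""
    if len(icu_patients) != len(daily_deaths):
        raise ValueError("both series must cover the same days")
    return [icu + deaths for icu, deaths in zip(icu_patients, daily_deaths)]


# Made-up numbers for three consecutive days (illustration only).
icu = [120, 115, 108]
deaths = [30, 28, 25]

indicator = epidemic_indicator(icu, deaths)
print(indicator)  # [150, 143, 133]
```

If the source data reports cumulative deaths rather than daily deaths, the daily figure would first have to be derived as the difference between consecutive days.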
The main assumption of our analysis is that the Epidemic Indicator is a reliable variable for estimating the current progression of the epidemic and for modeling future trends. After April 1st, this variable decreases at a constant rate. For this reason, we can model its future trend using Linear Regression, one of the most popular classical Machine Learning algorithms for supervised learning. The algorithm is relatively easy to implement and works well when the relationship between covariates and response variable is known to be linear (in our case, Epidemic Indicator vs. time). A known disadvantage is that Linear Regression oversimplifies many real-world problems.
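As an illustration of the idea, the sketch below fits a straight line to the Epidemic Indicator over time using ordinary least squares written in plain Python. The function name `fit_line` and the sample values are assumptions made for this example. The actual MSCT implementation may rely on a library instead.

```python
def fit_line(days, values):
    """Ordinary least squares fit of values = slope * day + intercept."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up, perfectly linear series: the indicator drops by 10 each day.
days = [0, 1, 2, 3, 4]
indicator = [100, 90, 80, 70, 60]

slope, intercept = fit_line(days, indicator)
print(slope, intercept)  # -10.0 100.0
```

A negative slope corresponds to the descending trend described above. The intercept is the fitted value on day 0.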
We separated the dataset into 20 regional subsets.
Then, we created and trained a Linear Regression model for each region. The trained models are used to predict the end of the epidemic, expressed as the number of days remaining before the Epidemic Indicator is expected to reach 0. Of course, not all regional models performed equally well. Some regions show clear recovery trends (e.g. Lombardia), while others seem to have a less linear evolution. For this reason, every model also returns a performance score, the coefficient of determination R² of the prediction. An R² of 1 indicates that the regression predictions fit the data perfectly. Therefore, the closer the value gets to 1, the more we can trust the model and its predictions.
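The per-region workflow can be sketched as follows: group the rows by region, fit a line to each subset, and report the remaining days together with R². All names (`fit_region`, `remaining_days`, "Region A", "Region B") and all numbers are hypothetical and chosen only for illustration. This is not the MSCT code itself.

```python
from collections import defaultdict


def fit_region(days, values):
    """Least squares fit; returns (slope, intercept, r2)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R² = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(days, values))
    ss_tot = sum((y - mean_y) ** 2 for y in values)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return slope, intercept, r2


def remaining_days(slope, intercept, today):
    """Days from `today` until the fitted line reaches 0 (None if not decreasing)."""
    if slope >= 0:
        return None
    zero_day = -intercept / slope
    return max(0.0, zero_day - today)


# Made-up (region, day, indicator) rows for two hypothetical regions.
rows = [
    ("Region A", 0, 100), ("Region A", 1, 90), ("Region A", 2, 80), ("Region A", 3, 70),
    ("Region B", 0, 50), ("Region B", 1, 30), ("Region B", 2, 40), ("Region B", 3, 20),
]

# Split the dataset into one subset per region.
by_region = defaultdict(list)
for region, day, value in rows:
    by_region[region].append((day, value))

# Fit one model per region and collect (remaining days, R²).
results = {}
for region, points in by_region.items():
    days = [d for d, _ in points]
    values = [v for _, v in points]
    slope, intercept, r2 = fit_region(days, values)
    results[region] = (remaining_days(slope, intercept, today=days[-1]), r2)

for region, (days_left, r2) in results.items():
    print(f"{region}: {days_left:.1f} days remaining, R2 = {r2:.2f}")
```

In this toy data, "Region A" is perfectly linear and gets an R² of 1, while the noisier "Region B" gets a lower score, so its estimate of the remaining days deserves less trust.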
More technical information about the model can be found at this link.
Who?
Pere Roca Ristol, GIS and web-mapping development (JRC, Ispra)
Andrea Amparore, modelling MSCT (World Food Programme, Rome)