Predicting Future Daily Covid-19 Cases “Tunisia” Coronavirus

May 31, 2020


Get the latest Coronavirus data with a unique Python package
A tiny Python package for easy access to up-to-date Coronavirus (COVID-19, SARS-CoV-2) cases data.

To install this package, simply run:
!pip install COVID19Py
## Collecting COVID19Py
## Downloading COVID19Py-0.3.0.tar.gz (4.9 kB)
## Successfully installed COVID19Py-0.3.0
# loading packages and libraries
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
from datetime import datetime
import COVID19Py
# Create a New Instance to Access Data Source
covid19 = COVID19Py.COVID19()
# Locations can be fetched by country code or by a specific location ID; here we select Tunisia (ID 212)
timeline = covid19.getLocationById(212)['timelines']['confirmed']['timeline']

dic = { 'Date' : list(timeline.keys()),
'Cases': list(timeline.values())}

df = pd.DataFrame.from_dict(dic)
fig = go.Figure(data=go.Scatter(x=df.Date, y=df.Cases, mode='lines+markers'))

1- For the first 41 days, Tunisia had 0 cases, so we will delete these rows.
2- The number of cases is cumulative. We will undo the accumulation to get the number of new cases per day.
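To make the second step concrete, here is a minimal sketch in plain Python on made-up numbers (the counts and the helper name `daily_from_cumulative` are illustrative and not taken from the Tunisia data). It mirrors what `diff().fillna(0)` does in pandas: each day's value becomes the difference from the previous day, and the first day, which has no predecessor, is set to 0.

```python
def daily_from_cumulative(cumulative):
    """Turn cumulative totals into per-day increments.

    The first day has no previous value, so it is reported as 0,
    matching the behaviour of pandas' diff().fillna(0).
    """
    if not cumulative:
        return []
    daily = [0]
    for previous, current in zip(cumulative, cumulative[1:]):
        daily.append(current - previous)
    return daily


# Hypothetical cumulative counts over five days
cumulative_cases = [1, 1, 2, 5, 7]
print(daily_from_cumulative(cumulative_cases))  # [0, 0, 1, 3, 2]
```

Note that with this approach the first retained day is always reported as 0, whatever its true count was.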

corona = df.copy()
# delete the first 41 lines
corona = corona[41:]
corona = corona.reset_index(drop=True)
# cancel the accumulation
corona = corona.set_index('Date')
corona = corona.diff().fillna(0).astype(np.int64)
fig = go.Figure(data=go.Scatter(x=corona.index, y=corona.Cases, mode='lines+markers'))


# Splitting Data into a Training set and a Test set
test_data_size = 20
train_data = corona[:-test_data_size]
test_data = corona[-test_data_size:]
train_set = train_data.values
test_set = test_data.values
# To increase the training speed and performance of the model, we'll use the MinMaxScaler from scikit-learn (it scales the data values between 0 and 1)
# Initialising the MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
# Fit the scaler on the training values only, then apply the same scaling to the test values
train_set = scaler.fit_transform(train_set)
test_set = scaler.transform(test_set)
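As a rough illustration of what the scaler does, the sketch below reimplements min-max scaling in plain Python on made-up numbers. The class name `SimpleMinMaxScaler` is hypothetical and this is not scikit-learn's implementation; it only shows the formula x_scaled = (x - min) / (max - min), with the minimum and maximum learned from the training data.

```python
class SimpleMinMaxScaler:
    """Illustrative stand-in for a (0, 1) min-max scaler (not scikit-learn)."""

    def fit(self, values):
        # Learn the range from the data passed in (the training data)
        self.min_ = min(values)
        self.max_ = max(values)
        return self

    def transform(self, values):
        span = self.max_ - self.min_
        return [(v - self.min_) / span for v in values]

    def inverse_transform(self, scaled):
        span = self.max_ - self.min_
        return [s * span + self.min_ for s in scaled]


train = [0, 10, 20, 40]  # hypothetical daily cases used for training
test = [30, 50]          # hypothetical held-out days

toy_scaler = SimpleMinMaxScaler().fit(train)
print(toy_scaler.transform(train))  # [0.0, 0.25, 0.5, 1.0]
print(toy_scaler.transform(test))   # [0.75, 1.25]
print(toy_scaler.inverse_transform(toy_scaler.transform(test)))  # [30.0, 50.0]
```

Test values can land outside the 0 to 1 range (1.25 here) when they exceed the training maximum. That is expected when the range is learned from the training data only.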
# Currently, we have a big sequence of daily cases. We'll convert it into smaller ones:
def sequences(data, seq_length):
    X_values = []
    Y_label = []
    for i in range(seq_length, len(data)):
        X_values.append(data[i-seq_length:i, 0])
        Y_label.append(data[i, 0])
    return np.array(X_values), np.array(Y_label)
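# Create sequences
train_X, train_Y = sequences(train_set, seq_length=7)
test_X, test_Y = sequences(test_set, seq_length=7)
# LSTM layers expect 3D input: (samples, time steps, features)
train_X = np.reshape(train_X, (train_X.shape[0], train_X.shape[1], 1))
test_X = np.reshape(test_X, (test_X.shape[0], test_X.shape[1], 1))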
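To see what the windowing produces, here is a small sketch on a made-up series of six values, using a window of 3 instead of 7. The helper `make_windows` is a hypothetical plain-Python stand-in for the `sequences` function (it works on a flat list rather than a 2D array), so treat it as an illustration of the idea only.

```python
def make_windows(series, seq_length):
    """Pair each run of `seq_length` values with the value that follows it."""
    inputs, targets = [], []
    for i in range(seq_length, len(series)):
        inputs.append(series[i - seq_length:i])
        targets.append(series[i])
    return inputs, targets


toy_series = [10, 20, 30, 40, 50, 60]  # made-up values
X, y = make_windows(toy_series, seq_length=3)
for window, target in zip(X, y):
    print(window, "->", target)
# [10, 20, 30] -> 40
# [20, 30, 40] -> 50
# [30, 40, 50] -> 60
```

A series of length n yields n - seq_length pairs, so the 20-day test set with a window of 7 gives only 13 test samples.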

Building the model

model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(train_X.shape[1], 1)))
model.add(LSTM(units=50, return_sequences=True))
# The last LSTM layer returns only its final output, so the model predicts one value per sequence
model.add(LSTM(units=50))
model.add(Dense(units=1))
model.compile(optimizer='adam', loss='mean_squared_error')
history = model.fit(train_X, train_Y, epochs=100, validation_data=(test_X, test_Y))
### Let's have a look at the train and test loss:
# One x-axis point per epoch actually recorded in the training history
epochs_range = list(range(1, len(history.history['loss']) + 1))
fig = go.Figure()
fig.add_trace(go.Scatter(x=epochs_range, y=history.history['loss'], mode='lines+markers', name='Training loss'))
fig.add_trace(go.Scatter(x=epochs_range, y=history.history['val_loss'], mode='lines+markers', name='Validation loss'))

Predicting daily cases

test_inputs = corona[-test_data_size-7:].values
test_inputs = test_inputs.reshape(-1,1)
test_inputs = scaler.transform(test_inputs)
features_X, features_Y = sequences(test_inputs, seq_length=7)
features_X = np.array(features_X)
features_X = np.reshape(features_X, (features_X.shape[0], features_X.shape[1], 1))

features_Y = np.array(features_Y)
features_Y = np.reshape(features_Y, (features_X.shape[0], 1))
predictions = model.predict(features_X)
# We have to reverse the scaling of the test data and the model predictions:
features_Y = scaler.inverse_transform(features_Y)
predictions = scaler.inverse_transform(predictions)
features = [list(i)[0] for i in list(features_Y )]
predict = [list(i)[0] for i in list(predictions)]
fig = go.Figure()
fig.add_trace(go.Scatter(x=corona.index[:len(train_data)], y= corona.Cases[:len(train_data)], mode='lines+markers',name='Historical Daily Cases'))
fig.add_trace(go.Scatter(x=corona.index[-len(test_data):],y=features , mode='lines+markers',name='Real Daily Cases'))
fig.add_trace(go.Scatter(x=corona.index[-len(test_data):], y=predict, mode='lines+markers',name='Predicted Daily Cases'))


The model’s performance is not that great, but this is expected given the small amount of data. Predicting daily Covid-19 cases is a hard problem. We’re amidst an outbreak, and there’s more to be done. Hopefully, everything will be back to normal after some time.
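One way to put a number on the model's performance is to compute error metrics on the test days and compare them with a naive baseline. The sketch below uses made-up values, not the article's results; the function names and the "repeat the previous day's value" baseline are illustrative choices rather than part of the original analysis.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between real and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; penalises large misses more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up daily cases and model outputs, for illustration only
actual_cases = [4, 6, 3, 5]
model_forecast = [5, 4, 4, 5]

# Naive baseline: predict each day with the previous day's real value
previous_day = 2  # hypothetical value for the day before the test window
naive_forecast = [previous_day] + actual_cases[:-1]

print(mean_absolute_error(actual_cases, model_forecast))      # 1.0
print(root_mean_squared_error(actual_cases, model_forecast))  # sqrt(1.5), about 1.22
print(mean_absolute_error(actual_cases, naive_forecast))      # 2.25
```

With the variables from this article, the same functions could be called as `mean_absolute_error(features, predict)`. If the model does not beat the naive baseline, that is a sign it has not learned much beyond persistence.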




Written by Wajdi HAJJI

Data Scientist and Machine Learning Enthusiast ❤❤❤
