Country Profile: United Kingdom
This project explores the development of COVID-19, the infectious disease caused by the coronavirus SARS-CoV-2, in the United Kingdom. Specifically, the data answers the following questions:
What is the daily number of reported cases?
What is the cumulative number of reported cases?
What is the daily number of confirmed deaths?
What is the cumulative number of confirmed deaths?
What is the daily number of new tests?
What is the cumulative number of tests?
What is the death rate (ratio of confirmed deaths to reported cases)?
What fraction of tests returned a positive result?
How many cases, deaths and tests were recorded for each day of the month?
How many cases, deaths and tests were recorded for each day of the week?
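As a rough illustration of how these questions map onto simple calculations, here is a minimal sketch using only the Python standard library. The field names (`new_cases`, `new_deaths`, `new_tests`), the helper functions, and the numbers are all hypothetical and made up for the example; they are not taken from the notebook or from real UK data.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical daily records (made-up numbers, not real UK data).
daily = [
    {"date": date(2021, 1, 4), "new_cases": 100, "new_deaths": 2, "new_tests": 1000},
    {"date": date(2021, 1, 5), "new_cases": 150, "new_deaths": 3, "new_tests": 1500},
    {"date": date(2021, 1, 11), "new_cases": 50, "new_deaths": 1, "new_tests": 500},
]

def cumulative(records, key):
    """Running total of a daily metric, in record order."""
    return list(accumulate(r[key] for r in records))

def ratio(numerator, denominator):
    """Safe division: returns None when the denominator is zero."""
    return numerator / denominator if denominator else None

def totals_by(records, group_fn):
    """Sum cases, deaths and tests per group (e.g. weekday or day of month)."""
    totals = defaultdict(lambda: {"new_cases": 0, "new_deaths": 0, "new_tests": 0})
    for r in records:
        bucket = totals[group_fn(r["date"])]
        for key in bucket:
            bucket[key] += r[key]
    return dict(totals)

total_cases = cumulative(daily, "new_cases")
total_deaths = cumulative(daily, "new_deaths")
total_tests = cumulative(daily, "new_tests")

# Death rate: confirmed deaths divided by reported cases.
death_rate = ratio(total_deaths[-1], total_cases[-1])
# Positive rate: share of tests that came back positive.
positive_rate = ratio(total_cases[-1], total_tests[-1])

by_weekday = totals_by(daily, lambda d: WEEKDAYS[d.weekday()])
by_day_of_month = totals_by(daily, lambda d: d.day)
```

With the sample records above, the two Mondays are pooled into one weekday bucket, while each calendar day gets its own day-of-month bucket.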
The Project
The data set is the COVID-19 data collection maintained by Our World in Data (OWID). It is updated daily and includes metrics on confirmed cases, deaths, and testing, as well as other variables of potential interest. The data is read directly from a URL on the OWID site, and the analysis is conducted in a Jupyter Notebook running a Python kernel.
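Reading a CSV straight from a URL can be sketched roughly as follows. This is an assumption-laden example rather than the notebook's actual code: the URL is a placeholder to be replaced with the CSV link published by OWID, the `location` column name is assumed, and the helper names `fetch_csv_text` and `rows_for_country` are hypothetical.

```python
import csv
import io
from urllib.request import urlopen

# Placeholder URL (assumption): replace with the CSV link published by OWID.
DATA_URL = "https://example.org/owid-covid-data.csv"

def fetch_csv_text(url=DATA_URL, timeout=30):
    """Download the CSV file and return its contents as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")

def rows_for_country(csv_text, location="United Kingdom"):
    """Parse CSV text and keep only the rows for one country.

    Assumes the file has a 'location' column holding the country name.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row.get("location") == location]

# Tiny made-up sample so the parsing step can be tried without a network call.
sample = (
    "location,date,new_cases\n"
    "United Kingdom,2021-01-04,100\n"
    "France,2021-01-04,90\n"
)
uk_rows = rows_for_country(sample)
```

In a notebook, the two steps would be chained as `rows_for_country(fetch_csv_text())`. Keeping the download separate from the parsing makes the filter easy to check offline.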
What We Learned
setting a default global theme for visualisations to ensure consistency across the notebook
using the urllib module to extract data directly from the web
adjusting bin size with histograms to draw out trends
formatting heatmaps to make them reader-friendly
using colour-coded alert boxes in Markdown to highlight notes, warnings, and dangers
feature engineering to extract month, day, and hour elements from DateTime objects
using custom tick labels to rename axis values
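The date feature engineering mentioned above can be illustrated with a small sketch. It uses the standard library's `datetime` rather than whatever the notebook actually uses, and the `date_features` helper and the sample timestamp are hypothetical.

```python
from datetime import datetime

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

def date_features(timestamp):
    """Split an ISO 8601 timestamp string into calendar components."""
    dt = datetime.fromisoformat(timestamp)
    return {
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "weekday": WEEKDAY_NAMES[dt.weekday()],  # weekday() gives 0 for Monday
    }

# Made-up timestamp for illustration.
features = date_features("2021-01-04T09:30:00")
```

Each extracted component can then serve as a grouping key, for example to total cases per day of the week.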
The Code and the Report
GitHub repository for the data and the Jupyter Notebook
the PDF report can also be found here