<a href="https://covid19datahub.io"><img src="https://storage.covid19datahub.io/logo.svg" align="right" height="128"/></a>

COVID-19 Data Hub

Funded by the Institute for Data Valorization IVADO in 2020. Supported by the R Consortium from 2021 to 2024. Funded by the University of Lugano USI in 2025.

JOB OFFER (published November 5, 2024): Hiring a research assistant with a Master's or higher degree to work on the COVID-19 Data Hub, starting ASAP! The position is funded by the University of Lugano, Switzerland. Possibility to work partially or fully remotely. Read more here.

This repository aggregates COVID-19 data at a fine-grained spatial resolution from several sources and makes them available as ready-to-use CSV files at https://covid19datahub.io.

| Variable | Description |
|---|---|
| confirmed | Cumulative number of confirmed cases |
| deaths | Cumulative number of deaths |
| recovered | Cumulative number of patients released from hospitals or reported recovered |
| tests | Cumulative number of tests |
| vaccines | Cumulative number of total doses administered |
| people_vaccinated | Cumulative number of people who received at least one vaccine dose |
| people_fully_vaccinated | Cumulative number of people who received all doses prescribed by the vaccination protocol |
| hosp | Number of hospitalized patients on date |
| icu | Number of hospitalized patients in intensive therapy on date |
| vent | Number of patients requiring invasive ventilation on date |
| population | Total population |
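
Most of the variables above are cumulative, so daily figures have to be derived by differencing consecutive observations. The following is a minimal sketch in Python, using illustrative numbers rather than real data. The function name `daily_from_cumulative` is hypothetical, and treating a missing value as "no update on that day" is an assumption of the sketch, not a rule of the dataset.

```python
from typing import List, Optional


def daily_from_cumulative(cumulative: List[Optional[int]]) -> List[Optional[int]]:
    """Difference a cumulative series into daily increments.

    Missing values (None) yield a missing increment and leave the running
    baseline unchanged. This handling is an assumption of the sketch.
    """
    daily: List[Optional[int]] = []
    previous: Optional[int] = None
    for value in cumulative:
        if value is None:
            daily.append(None)
            continue
        # The first observed value has no predecessor, so its increment is unknown.
        daily.append(None if previous is None else value - previous)
        previous = value
    return daily


# Illustrative cumulative counts of confirmed cases (not real data).
confirmed = [10, 15, None, 22, 21]
print(daily_from_cumulative(confirmed))
```

A negative increment, as produced by the last element of the example, indicates that the cumulative series decreased, for instance after a revision by the provider. How to treat such values is left to the analysis.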

The dataset also includes policy measures by Oxford's government response tracker, and a set of keys to match the data with Google and Apple mobility reports, with the Hydromet dataset, and with spatial databases such as Eurostat for Europe or GADM worldwide.
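
Matching the data with an external dataset through one of these keys amounts to a join on the key column. The sketch below illustrates the idea in Python with made-up rows. The column names `key_mobility`, `place_id`, and `mobility_change`, as well as the function `attach_by_key`, are hypothetical placeholders; check the actual CSV header and the external dataset for the real column names.

```python
from typing import Dict, List

Row = Dict[str, object]


def attach_by_key(hub_rows: List[Row], external_rows: List[Row],
                  hub_key: str, external_key: str) -> List[Row]:
    """Left-join external records onto hub records through a matching key."""
    # Index the external table by its identifier for constant-time lookups.
    index = {row[external_key]: row for row in external_rows}
    joined: List[Row] = []
    for row in hub_rows:
        merged = dict(row)
        key = row.get(hub_key)
        # Rows with an empty or missing key are kept but not matched.
        match = index.get(key) if key else None
        if match is not None:
            merged.update({k: v for k, v in match.items() if k != external_key})
        joined.append(merged)
    return joined


# Made-up rows for illustration only.
hub = [
    {"id": "A", "confirmed": 100, "key_mobility": "place-1"},
    {"id": "B", "confirmed": 50, "key_mobility": ""},
]
mobility = [{"place_id": "place-1", "mobility_change": -12}]
print(attach_by_key(hub, mobility, "key_mobility", "place_id"))
```

A left join is used so that locations without a matching key are kept rather than silently dropped.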

Administrative divisions

The data are provided at 3 different levels of granularity:

Download the data

All the data are available to download at the download centre.
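
Once a file has been downloaded, it can be processed with any CSV reader. The sketch below parses CSV text with Python's standard `csv` module and computes cases per 100,000 inhabitants. It uses a small inline sample instead of a real download so that it runs offline. The sample values are made up, the `id` and `date` columns and the use of an empty field or `NA` for missing values are assumptions, and the helper names `parse_hub_csv` and `per_100k` are hypothetical.

```python
import csv
import io
from typing import Dict, List, Optional

# Numeric variables listed in the table of this README.
NUMERIC_COLUMNS = (
    "confirmed", "deaths", "recovered", "tests", "vaccines",
    "people_vaccinated", "people_fully_vaccinated",
    "hosp", "icu", "vent", "population",
)


def parse_hub_csv(text: str) -> List[Dict[str, object]]:
    """Parse CSV text and convert the known numeric columns to floats."""
    rows: List[Dict[str, object]] = []
    for record in csv.DictReader(io.StringIO(text)):
        for name in NUMERIC_COLUMNS:
            if name in record:
                raw = record[name]
                # Assumption: missing values appear as empty fields or "NA".
                record[name] = None if raw in (None, "", "NA") else float(raw)
        rows.append(record)
    return rows


def per_100k(count: Optional[float], population: Optional[float]) -> Optional[float]:
    """Return the count per 100,000 inhabitants, or None if it cannot be computed."""
    if count is None or not population:
        return None
    return count * 100000 / population


# Made-up sample in the assumed layout (not real data).
SAMPLE = """id,date,confirmed,deaths,population
X1,2021-01-01,100,2,1000
X1,2021-01-02,130,,1000
"""

rows = parse_hub_csv(SAMPLE)
latest = rows[-1]
print(latest["date"], per_100k(latest["confirmed"], latest["population"]))
```

To read a downloaded file instead of the inline sample, pass the file contents, for example `parse_hub_csv(open(path, encoding="utf-8").read())`, where `path` points to the local copy.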

How it works

COVID-19 Data Hub is developed around 2 concepts:

Extracting the data for one country may require several data sources. For this reason, the code in the R folder is organized into two main types of files: ds_ files for data sources and iso_ files for countries.

The ds_ files implement a wrapper to pull the data from a provider and import them into an R data.frame with standardized column names. The iso_ files take care of merging all the data sources needed for one country and of mapping the identifiers used by the provider to the id listed in the CSV files. Finally, the function covid19 takes care of downloading the data for all countries at all levels.
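
The division of labour between ds_ and iso_ files can be illustrated with a small sketch. It is written in Python purely for illustration (the repository code is R) and does not reproduce the actual implementation. Every name in it, including `ds_example_cases`, `ds_example_tests`, `iso_xx`, the provider identifiers, the hub identifiers, and the numbers, is hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def ds_example_cases() -> List[Record]:
    """Stand-in for a ds_ wrapper: return provider data with standardized names."""
    # A real wrapper would download and parse the provider's files here.
    raw = [
        {"Region": "North", "Day": "2021-01-01", "TotalCases": 100},
        {"Region": "South", "Day": "2021-01-01", "TotalCases": 40},
    ]
    return [
        {"provider_id": r["Region"], "date": r["Day"], "confirmed": r["TotalCases"]}
        for r in raw
    ]


def ds_example_tests() -> List[Record]:
    """Second stand-in source that provides a different variable."""
    return [{"provider_id": "North", "date": "2021-01-01", "tests": 900}]


def iso_xx() -> List[Record]:
    """Stand-in for an iso_ file: merge the sources for one country and map ids."""
    # Map the identifiers used by the providers to the ids used in the output.
    id_map = {"North": "hub-001", "South": "hub-002"}
    merged: Dict[Tuple[str, str], Record] = {}
    for source in (ds_example_cases(), ds_example_tests()):
        for rec in source:
            key = (id_map[str(rec["provider_id"])], str(rec["date"]))
            row = merged.setdefault(key, {"id": key[0], "date": key[1]})
            # Copy the standardized variables, leaving out the join columns.
            row.update({k: v for k, v in rec.items() if k not in ("provider_id", "date")})
    return [merged[k] for k in sorted(merged)]


for row in iso_xx():
    print(row)
```

Keeping the provider wrappers separate from the country-level merge means that a change in one provider's format only touches the corresponding wrapper.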

The code runs continuously on a dedicated Linux server to process the data from the providers. In principle, one can use the function covid19 from the repository to generate the same data we provide at the download centre. However, this takes between 1 and 2 hours, so downloading the pre-computed files is typically more convenient.

Contribute

If you find any issues with the data, please report a bug.

Academic publications

The first version of the project is described in "COVID-19 Data Hub", Journal of Open Source Software, 2020. The implementation details and the latest version of the data are described in "A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution", Scientific Data, Nature, 2022. You can browse the publications that use COVID-19 Data Hub here and here. Please cite our paper(s) when using COVID-19 Data Hub.

Cite as

We have invested a lot of time and effort in creating COVID-19 Data Hub. Please cite the following when using it:

Guidotti, E., Ardia, D., (2020), "COVID-19 Data Hub", Journal of Open Source Software 5(51):2376, doi: 10.21105/joss.02376.

A BibTeX entry for LaTeX users is:

@Article{guidotti2020,
    title = {COVID-19 Data Hub},
    year = {2020},
    doi = {10.21105/joss.02376},
    author = {Emanuele Guidotti and David Ardia},
    journal = {Journal of Open Source Software},
    volume = {5},
    number = {51},
    pages = {2376}
}

The implementation details and the latest version of the data are described in:

Guidotti, E., (2022), "A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution", Sci Data 9, 112, doi: 10.1038/s41597-022-01245-1

A BibTeX entry for LaTeX users is:

@Article{guidotti2022,
    title = {A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution},
    year = {2022},
    doi = {10.1038/s41597-022-01245-1},
    author = {Emanuele Guidotti},
    journal = {Scientific Data},
    volume = {9},
    number = {1},
    pages = {112}
}

Terms of use

By using COVID-19 Data Hub, you agree to our terms of use.

Supported by

<div style="height:96px"> <img height="96" src="man/figures/RConsortium.png" alt="R Consortium" style="margin-right:8px"/> <img height="96" src="man/figures/ivado.png" alt="IVADO" style="margin-right:8px"/> <img height="96" src="man/figures/hec-montreal.jpg" alt="HEC Montréal" style="display:inline-block;margin-right:8px" /> <img height="96" src="man/figures/hackzurich.jpeg" alt="Hack Zurich" style="display:inline-block;margin-right:8px" /> <img height="96" src="man/figures/unimi.jpg" alt="Università degli Studi di Milano" style="display:inline-block;margin-right:8px" /> </div>