Daily Portuguese COVID-19 Data

Last updated: Thu 17 Nov 2022 (16:20:00 UTC [+0000])

Download User Friendly Version

Source

For more information about the data and variables see: https://github.com/dssg-pt/covid19pt-data

The original data were downloaded from an API provided by VOST: https://covid19-api.vost.pt/Requests/get_entry/

Summary: Last 10 (available) Days

| Date | Cases (7 Day Mean) | Active Cases | Deaths (7 Day Mean) |
|---|---|---|---|
| Sat 11 Dec 2021 | 5373 (3915.4) | 65648 | 19 (18.7) |
| Sun 12 Dec 2021 | 3879 (3928.7) | 68117 | 13 (17.3) |
| Mon 13 Dec 2021 | 2314 (3942.7) | 68538 | 15 (17.4) |
| Tue 14 Dec 2021 | 3591 (3967.6) | 65757 | 14 (16.4) |
| Wed 15 Dec 2021 | 5800 (4041.0) | 67960 | 11 (15.9) |
| Thu 16 Dec 2021 | 5137 (4262.3) | 69672 | 19 (15.3) |
| Fri 17 Dec 2021 | 4644 (4391.1) | 70406 | 24 (16.4) |
| Sat 18 Dec 2021 | 5062 (4346.7) | 70440 | 12 (15.4) |
| Sun 19 Dec 2021 | 4266 (4402.0) | 72989 | 25 (17.1) |
| Mon 20 Dec 2021 | 2752 (4464.6) | 73700 | 18 (17.6) |

<img src="README_figs/README-plotNewCases-1.png" width="672" />
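
A table like the summary above could be assembled along the following lines. This is only a sketch on made-up values, not the code that produced the table: the object names `toy`, `wide` and `last10` are hypothetical, and it assumes that "Cases" and "Deaths" are day-on-day changes in the cumulative `confirmados` and `obitos` series while "Active Cases" is the recorded `ativos` level.

```r
library(data.table)

# Toy long-format data (made-up values) using the README's column names.
days <- seq(as.Date("2021-12-01"), by = "day", length.out = 12)
toy <- rbindlist(list(
  data.table(data = days, origVars = "confirmados", value = cumsum(rep(100, 12))),
  data.table(data = days, origVars = "obitos",      value = cumsum(rep(2, 12))),
  data.table(data = days, origVars = "ativos",      value = 5000 + 10 * (1:12))
))
setkeyv(toy, c("origVars", "data"))

# Day-on-day change and its 7-day rolling mean, within each variable.
toy[, dailyChange := value - shift(value, n = 1, type = "lag"), by = origVars]
toy[, mean7Day := frollmean(dailyChange, 7), by = origVars]

# One row per date, one column per (measure, variable) pair.
wide <- dcast(toy, data ~ origVars, value.var = c("value", "dailyChange", "mean7Day"))

# Keep the last 10 days and format the columns as "count (7-day mean)".
last10 <- tail(wide, 10)[, .(
  Date        = format(data, "%a %d %b %Y"),
  Cases       = sprintf("%.0f (%.1f)", dailyChange_confirmados, mean7Day_confirmados),
  ActiveCases = sprintf("%.0f", value_ativos),
  Deaths      = sprintf("%.0f (%.1f)", dailyChange_obitos, mean7Day_obitos)
)]
last10
```

With real data, `toy` would be replaced by the rows of the long dataset for `region == "Portugal"` with no sex or age breakdown.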

Example Usage

Read in the data

The examples below use the data.table package to process the data.

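# Load libraries
library(data.table)  # fast reading and processing
library(here)        # file paths relative to the project root
library(ggplot2)     # plots in the sections below
library(magrittr)    # provides the %>% pipe used with ggplot()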

# Read in data as a data.frame and data.table object.
CVPT <- fread(here("data", "covid19pt_DSSG_Long.csv"))
# Alternatively, read the file from the direct link:
# CVPT <- fread("https://raw.githubusercontent.com/CEAUL/Dados_COVID-19_PT/master/data/covid19pt_DSSG_Long.csv")

# Looking at the key variables in the original long dataset.
CVPT[, .(data, origVars, origType, sex, ageGrp, region, value, valueUnits)]
##              data   origVars   origType sex ageGrp   region  value valueUnits
##     1: 2020-02-26     ativos     ativos All        Portugal     NA           
##     2: 2020-02-27     ativos     ativos All        Portugal     NA           
##     3: 2020-02-28     ativos     ativos All        Portugal     NA           
##     4: 2020-02-29     ativos     ativos All        Portugal     NA           
##     5: 2020-03-01     ativos     ativos All        Portugal     NA           
##    ---                                                                       
## 60420: 2021-12-16 vigilancia vigilancia All        Portugal  95430      Count
## 60421: 2021-12-17 vigilancia vigilancia All        Portugal  97573      Count
## 60422: 2021-12-18 vigilancia vigilancia All        Portugal  99081      Count
## 60423: 2021-12-19 vigilancia vigilancia All        Portugal 100339      Count
## 60424: 2021-12-20 vigilancia vigilancia All        Portugal 100955      Count

# Order data by original variable name and date.
setkeyv(CVPT, c("origVars", "data"))

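# Convert `data` to a Date object and add a change-from-previous-day variable (dailyChange).
# dailyChange is set to NA for the symptom variables (sintomas*).
# Add a 7-day rolling mean of dailyChange for ativos, confirmados, obitos and recuperados.
# The column name `data` means "date" in Portuguese.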
CV <- CVPT[, data := as.Date(data, format = "%Y-%m-%d")][
  , dailyChange := value - shift(value, n=1, fill=NA, type="lag"), by = origVars][
    grepl("^sintomas", origVars), dailyChange := NA][
  , mean7Day := fifelse(origVars %chin% c("ativos", "confirmados", "obitos", "recuperados"), 
                         frollmean(dailyChange, 7), as.numeric(NA))]
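
To make the two derived columns concrete, here is a minimal sketch on a toy cumulative series. The values are made up and the object `toy` is hypothetical. Unlike the chain above, the rolling mean here is explicitly grouped by `origVars`, which is a choice made for this sketch so that no window spans two variables.

```r
library(data.table)

# Toy cumulative counts (made-up values, not real data) for two variables.
toy <- data.table(
  data     = rep(seq(as.Date("2021-01-01"), by = "day", length.out = 9), times = 2),
  origVars = rep(c("confirmados", "obitos"), each = 9),
  value    = c(seq(100, 900, by = 100),   # confirmados rises by 100 per day
               seq(2, 18, by = 2))        # obitos rises by 2 per day
)
setkeyv(toy, c("origVars", "data"))

# Day-on-day change within each variable. The first day of each group has no
# previous value, so its change is NA.
toy[, dailyChange := value - shift(value, n = 1, fill = NA, type = "lag"), by = origVars]

# 7-day rolling mean of the daily change. A window containing an NA gives NA,
# so the first 7 rows of each group are NA.
toy[, mean7Day := frollmean(dailyChange, 7), by = origVars]

toy[origVars == "obitos", .(data, value, dailyChange, mean7Day)]
```

The leading NA values explain why plots of `mean7Day` start about a week after the first recorded date.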

Overall Number of Deaths (daily)

# Change the ggplot theme.
theme_set(theme_bw())
# A data error prevents plotting deaths by sex, so only the overall series is used.
# obMF <- CV[origType=="obitos" & sex %chin% c("M", "F") & ageGrp=="" & region == "Portugal"]
obAll <- CV[origType=="obitos" & sex %chin% c("All") & ageGrp=="" & region == "Portugal"][ 
  , sex := NA]

obAll %>% 
  ggplot(aes(x = data, y = dailyChange)) +
  geom_bar(stat = "identity", fill = "grey75") +
  geom_line(data = obAll, aes(x = data, y = mean7Day), group=1, colour = "brown") +
  scale_x_date(date_breaks = "2 months",
               date_labels = "%b-%y",
               # Start the x-axis at the earliest date in the data.
               limits = c(min(CV$data, na.rm = TRUE), NA)) +
  scale_y_continuous(breaks = seq(0, max(obAll[, dailyChange], na.rm = TRUE) + 50, 50)) +
  theme(legend.position = "bottom") +
  labs(
    title = "COVID-19 Portugal: Number of Daily Deaths with 7 Day Rolling Mean",
    x = "",
    y = "Number of Deaths",
    colour = "",
    fill = "",
    caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])"))
    )
## Warning: Removed 1 rows containing missing values (`position_stack()`).
## Warning: Removed 7 rows containing missing values (`geom_line()`).
<img src="README_figs/README-deathsbySex-1.png" width="672" />

Recorded Number of Confirmed COVID-19 Cases by Age Group

CV[origType=="confirmados" & !(ageGrp %chin% c("", "desconhecidos"))][
  , .(valueFM = sum(value)), .(data, ageGrp)] %>%
  ggplot(., aes(x=data, y=valueFM, colour = ageGrp)) +
  geom_line() +
  scale_x_date(date_breaks = "2 months",
               date_labels = "%b-%y",
               # Start the x-axis at the earliest date in the data.
               limits = c(min(CV$data, na.rm = TRUE), NA)) +
  scale_y_continuous(labels = scales::number_format(big.mark = ",")) +
  theme(legend.position = "bottom") +
  labs(
    title = "COVID-19 Portugal: Number of Confirmed Cases by Age Group",
    x = "",
    y = "Number of Confirmed Cases",
    caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])")),
    colour = "Age Group")
## Warning: Removed 54 rows containing missing values (`geom_line()`).
<img src="README_figs/README-casesbyAgeSex-1.png" width="672" />

Recorded Number of Confirmed COVID-19 Cases by Region

CV[origType=="confirmados" & ageGrp=="" & region!="Portugal"] %>%
  ggplot(., aes(x=data, y=value, colour=region)) +
  geom_line() +
  scale_x_date(date_breaks = "2 months",
               date_labels = "%b-%y",
               # Start the x-axis at the earliest date in the data.
               limits = c(min(CV$data, na.rm = TRUE), NA)) +
  scale_y_log10(labels = scales::number_format(big.mark = ",")) +
  theme(legend.position = "bottom") +
  labs(
    title = "COVID-19 Portugal: Number of Confirmed Cases by Region",
    x = "",
    y = "Number of Confirmed Cases",
    caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])")),
    colour = "Region")
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 648 rows containing missing values (`geom_line()`).
<img src="README_figs/README-casesbyRegion-1.png" width="672" /> <hr>

Issues & Notes

Use and interpret with care.

The data are provided as is. Any quality issues or errors in the source data will be reflected in the user-friendly data.

Please create an issue to discuss any errors, issues, requests or improvements.

The calculated change between days (dailyChange) can be negative.

CV[dailyChange<0 & !(origType %in% c("vigilancia", "internados"))][
  , .(data, origType, origVars, value, dailyChange)]
##             data    origType              origVars    value dailyChange
##    1: 2020-05-12      ativos                ativos 23737.00     -249.00
##    2: 2020-05-16      ativos                ativos 23785.00     -280.00
##    3: 2020-05-17      ativos                ativos 23182.00     -603.00
##    4: 2020-05-18      ativos                ativos 21548.00    -1634.00
##    5: 2020-05-22      ativos                ativos 21321.00     -862.00
##   ---                                                                  
## 1025: 2021-11-01          rt           rt_nacional     1.05       -0.03
## 1026: 2021-11-03          rt           rt_nacional     1.03       -0.02
## 1027: 2021-11-29          rt           rt_nacional     1.17       -0.02
## 1028: 2021-12-17          rt           rt_nacional     1.07       -0.01
## 1029: 2020-05-21 transmissao transmissao_importada   767.00       -3.00
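
As an illustration of how a negative `dailyChange` arises, the sketch below uses a toy series (made-up values) in which the recorded value drops on one day. The object `toy` and the flag column `negChange` are hypothetical names introduced for this example.

```r
library(data.table)

# Toy series (made-up values) where the recorded value falls on day 4.
toy <- data.table(
  data  = seq(as.Date("2021-01-01"), by = "day", length.out = 6),
  value = c(10, 15, 22, 20, 27, 31)
)

# Same day-on-day difference as dailyChange in the main example.
toy[, dailyChange := value - shift(value, n = 1, type = "lag")]

# Flag rows where the recorded value fell relative to the previous day.
toy[, negChange := !is.na(dailyChange) & dailyChange < 0]

toy[negChange == TRUE]
```

A flag like this makes it easy to decide, per variable, whether a drop is expected (for example a falling number of active cases) or needs checking against the source.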