EESW2019_tutorial

<a href="https://www.amazon.com/Statistical-Data-Cleaning-Applications-R/dp/1118897153"><img align="right" src="src/00tex/sdcr.jpg" width=200></a>

Materials for the short course on Statistical Data Cleaning for Business Statistics at the European Establishment Statistics Workshop 2019


Contents

Slot 1

| Topic | Time (minutes) |
|---|---|
| Introduction | 20 |
| Reading dirty data | 30 |
| Approximate matching | 50 |
| Data validation | 50 |

Slot 2

| Topic | Time (minutes) |
|---|---|
| Error localization | 20 |
| Imputation | 50 |
| Adjusting | 20 |
| Monitoring | 30 |
| Wrap-up | 10 |

Course form

The course is highly hands-on. Each topic starts with a session of roughly 10 to 15 minutes in which you run and adapt some R code. Next, I will provide background and details on what you just did, followed by a more in-depth assignment. Depending on the time and the topic, we will close with a further discussion.

Prerequisites

Bring a laptop

Participants are expected to have a basic knowledge of R/RStudio, explicitly:

Software needed for the course

  1. R. See https://r-project.org
  2. (Recommended) RStudio

Execute the following R code to install the necessary packages.

install.packages(c(
        "validate"
      , "errorlocate"
      , "simputation"
      , "rspa"
      , "daff"
      , "jsonlite"
      , "XML"
      , "readr"
      , "stringr"
      , "lumberjack")
  , dependencies=TRUE)
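
To confirm that the installation succeeded before the course starts, a check along the following lines may help. This is only a sketch: the variable names `needed`, `installed` and `missing_pkgs` are made up for illustration, and the package list is simply copied from the install command above.

```r
# Packages listed in the install command above
needed <- c("validate", "errorlocate", "simputation", "rspa", "daff",
            "jsonlite", "XML", "readr", "stringr", "lumberjack")

# requireNamespace() reports TRUE/FALSE per package without attaching it
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Collect the packages that could not be loaded (hypothetical helper name)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) == 0) {
  message("All course packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If any packages are reported as missing, rerunning `install.packages()` for just those names is usually the quickest fix.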

License

This work is licensed under a Creative Commons Attribution 4.0 International License.