The qogdata package is a collection of functions to manipulate Quality of Government (QOG) data and related material. It lets the user access QOG datasets and codebooks and contains helpers to search and prepare QOG variables. The package also contains methods to merge and plot time series data, and provides a few examples of how these methods can be deployed to import World Bank and country-level indicators into QOG datasets.

INSTALL

Version 0.1.5 of the qogdata package is installable with the devtools package:

library(devtools)
# current devtools expects the repository as "username/repo"
install_github("briatte/qogdata", dependencies = TRUE)
library(qogdata)

Installing dependencies is particularly recommended to work with country-level data. If the package fails to load after installation, install the dependencies manually:

deps <- c("countrycode", "foreign", "ggplot2", "maps", "WDI")
install.packages(deps)
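
To avoid reinstalling packages that are already present, the manual step above can be narrowed to the missing dependencies only. This is a minimal sketch in base R; missing_packages is a hypothetical helper, not part of qogdata or devtools.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

deps <- c("countrycode", "foreign", "ggplot2", "maps", "WDI")
to_install <- missing_packages(deps)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```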

HOWTO

This document describes all currently available functions, which are also internally documented. Additional notes are provided in the repository wiki. Please post comments, issues and suggestions in the Issues section. Feel free to email me if you have any additional queries or ideas on how the qogdata package could be developed.

qogdata

  1. Install and load the qogdata package.
  2. Type qogdata(TRUE) to download the QOG Standard dataset.
  3. Type qogbook(TRUE) to download the QOG Standard codebook.

Details:

qogdata points to a QOG server to download available versions of the QOG datasets. By default, it only returns the path to the QOG Standard cross-section dataset:

> qogdata()
[1] "http://www.qogdata.pol.gu.se/data/QoG_std_cs_15May13.csv"

Set file to TRUE or to a specific filename to download the dataset:

> QOG = qogdata(file = TRUE, format = "ts")
Downloading http://www.qogdata.pol.gu.se/data/QoG_std_ts_15May13.csv...
Loaded qog_std_ts_15May13.csv (N = 14137, 1946-2012, T = 67).

Set codebook to TRUE or to a specific filename to also download the codebook with the qogbook function:

> QOG = qogdata(file = TRUE, format = "ts", codebook = TRUE)
Downloading http://www.qogdata.pol.gu.se/data/QoG_std_ts_15May13.csv...
Loaded qog_std_ts_15May13.csv (N = 14137, 1946-2012, T = 67).
Downloading codebook to Codebook_QoG_Std15May13.pdf...
Codebook: Codebook_QoG_Std15May13.pdf
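
The two QOG Standard URLs shown above differ only in the format code (cs or ts). The sketch below rebuilds them from that code. qog_std_url is a hypothetical helper written for illustration; it is not how qogdata resolves its paths, and it covers only the 15 May 2013 Standard files quoted in this document.

```r
# Hypothetical helper: rebuild the QOG Standard URLs quoted above.
qog_std_url <- function(format = c("cs", "ts")) {
  format <- match.arg(format)  # defaults to "cs", the cross-section file
  sprintf("http://www.qogdata.pol.gu.se/data/QoG_std_%s_15May13.csv", format)
}

qog_std_url()      # cross-section dataset
qog_std_url("ts")  # time series dataset
```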

qogfind

qogfind uses two indexes bundled with the package to help the user find variables in QOG datasets:

> qogfind("public|administration")
QOG Standard results:
     variable                            label ts.min ts.max ts.N ts.T cs.N cs.min cs.max
217   gir_acs Administration and Civil Service   2004   2011  251    8   91   2006   2011
291 irai_epru    Equity of Public Resource Use   2005   2011  532    7   80   2006   2011
301  irai_qpa Quality of Public Administration   2005   2011  532    7   80   2006   2011
304  irai_tac Transparency, Accountability and   2005   2011  532    7   80   2006   2011
460  wdi_puhe Public Health Expenditure (% of    1995   2010 2960   16  187   2009   2009
710  wvs_f115 Justifiable: avoiding a fare on    1981   2008  157   28   NA     NA     NA
716   wvs_pet           Public self-expression   1981   2008  161   28   NA     NA     NA

The function searches through variable names and labels, as the lookfor command would in Stata. The ts columns provide years of measurement for the time-series dataset, and the cs columns do the same for the cross-sectional dataset. The information matches the figures reported in the QOG Standard Codebook and QOG Social Policy Codebook. It can be easily plotted:

library(ggplot2)

# one horizontal segment per variable, from its first to its last year of measurement
qplot(data = na.omit(qogfind("ims_", version = "soc")), 
      y = label, yend = label, x = as.numeric(ts.min), xend = as.numeric(ts.max), 
      geom = "segment", size = I(6), alpha = ts.T) +
  scale_alpha("Year range") +
  theme_minimal(16) +
  labs(y = NULL, x = NULL, title = "Data availability") + 
  theme(legend.position = "bottom")
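
The search over variable names and labels can be sketched in base R with a case-insensitive regular expression. The toy index and the find_variables helper below are assumptions made for illustration; the indexes bundled with qogdata and the internals of qogfind may differ.

```r
# Toy index: a few names and labels copied from the output above.
index <- data.frame(
  variable = c("gir_acs", "irai_qpa", "wdi_puhe", "undp_hdi"),
  label = c("Administration and Civil Service",
            "Quality of Public Administration",
            "Public Health Expenditure (% of",
            "Human Development Index"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose name or label matches the pattern.
find_variables <- function(pattern, index) {
  hit <- grepl(pattern, index$variable, ignore.case = TRUE) |
    grepl(pattern, index$label, ignore.case = TRUE)
  index[hit, ]
}

find_variables("public|administration", index)
```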

qogjoin

qogjoin is a highly idiosyncratic method to join historical to recent states in the QOG Standard time series dataset. Only states that went through a single territorial departure (e.g. France/Algeria or Pakistan/Bangladesh) are joined: more complex state partition cases like North/South Sudan and North/South Yemen are left aside.

In the current Standard version of the dataset (15 May 2013), this will cause a merge between the split versions of Ethiopia, France, Malaysia and Pakistan. This will make the dataset more backward compatible with older versions of the data where this separation did not exist.
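
The idea of joining a historical state to its recent counterpart can be illustrated by recoding identifiers through a lookup table. Everything in this sketch is made up: the identifiers 10, 11 and 20, the join_map lookup and the toy values are assumptions, and qogjoin itself may proceed differently.

```r
# Toy panel: 10 and 11 stand for one state before and after a territorial departure.
panel <- data.frame(
  id   = c(10, 10, 11, 11, 20, 20),
  year = c(1970, 1971, 1972, 1973, 1970, 1971),
  x    = c(1.0, 1.1, 1.2, 1.3, 5.0, 5.1)
)

# Hypothetical lookup: historical identifier (name) -> recent identifier (value).
join_map <- c("10" = 11)

joined <- panel
old <- as.character(joined$id) %in% names(join_map)
joined$id[old] <- unname(join_map[as.character(joined$id[old])])

# Each state-year must remain unique after the join.
stopifnot(anyDuplicated(joined[c("id", "year")]) == 0)
```

A join of this kind is only safe when the two series do not overlap in time, which is why the final check is worth keeping.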

xtdata, xtset, xt, xtdes

xtdata and its related functions store panel data properties (such as the identifier and time period variables) in the xtdata attribute of a data frame, which the other xt functions described below rely on.

Please consult the repository wiki for details on xt methods in the qogdata package, and how it might turn into a proper class by version 1.0. The method is intended for use with panel data: users who want to work with full-fledged time series should use the zoo or xts packages.
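
As a rough illustration of storing panel properties in a data frame attribute, the sketch below uses base R only. The fields of the list (data, time, type) are a hypothetical layout chosen for this example; the actual xtdata attribute may store different information.

```r
df <- data.frame(
  ccode = c(1, 1, 2, 2),
  year  = c(2000, 2001, 2000, 2001),
  x     = c(0.1, 0.2, 0.3, 0.4)
)

# Hypothetical layout: names of the identifier and time variables, plus a type.
attr(df, "xtdata") <- list(data = "ccode", time = "year", type = "country")

# Any function can then read the panel structure back from the data frame.
xt <- attr(df, "xtdata")
n_units   <- length(unique(df[[xt$data]]))
n_periods <- length(unique(df[[xt$time]]))
```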

xtshift, xtlag, xtlead, xtdecay, xttse

xtshift, xtlag and xtlead are functions to shift (lag or lead) a panel variable. xtdecay and xttse (time since event) are additional time series functions for panel data. These functions, as well as several other little utilities used in the qogdata package, are based on code found online (see the package documentation for the sources).
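
Lagging or leading a panel variable means shifting it within each unit, never across units. The base R sketch below shows the idea; shift is a hypothetical helper, not the package's xtshift, and it shifts by position, so it assumes that rows are sorted by time and that no time period is missing within a unit.

```r
panel <- data.frame(
  id   = c("A", "A", "A", "B", "B", "B"),
  year = c(2000, 2001, 2002, 2000, 2001, 2002),
  x    = c(1, 2, 3, 10, 20, 30)
)
panel <- panel[order(panel$id, panel$year), ]

# Hypothetical helper: k > 0 lags, k < 0 leads, padding with NA.
shift <- function(v, k = 1) {
  n <- length(v)
  if (k == 0 || n == 0) return(v)
  if (abs(k) >= n) return(rep(NA, n))
  if (k > 0) c(rep(NA, k), v[seq_len(n - k)]) else c(v[seq(-k + 1, n)], rep(NA, -k))
}

# ave() applies the shift separately within each unit.
panel$x_lag  <- ave(panel$x, panel$id, FUN = function(v) shift(v, 1))
panel$x_lead <- ave(panel$x, panel$id, FUN = function(v) shift(v, -1))
```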

xtmerge, xtsubset, xtsample

xtmerge performs a merge of two panel datasets based on their xtdata attributes, checking for identically formatted data identifiers and time periods before performing the merge, which otherwise works like merge in base R.

When xtmerge is provided datasets of type country with different country code formats, it runs the xtcountry helper function to determine the best ISO-3N conversion match and performs the merge on the new iso3n variable.
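
The kind of checks that precede the merge can be sketched around merge in base R. checked_merge is a hypothetical helper written for this example: it only verifies that both data frames carry the key variables, that the keys have the same class, and that unit-period rows are unique. The actual checks performed by xtmerge may differ.

```r
a <- data.frame(ccode = c(1, 1, 2, 2), year = c(2000, 2001, 2000, 2001), x = 1:4)
b <- data.frame(ccode = c(1, 1, 2, 2), year = c(2000, 2001, 2000, 2001), y = 5:8)

# Hypothetical helper: validate the panel keys, then defer to merge().
checked_merge <- function(x, y, id, time) {
  keys <- c(id, time)
  stopifnot(all(keys %in% names(x)), all(keys %in% names(y)))
  for (k in keys) {
    if (class(x[[k]])[1] != class(y[[k]])[1]) {
      stop("key '", k, "' is formatted differently in the two datasets")
    }
  }
  if (anyDuplicated(x[keys]) > 0 || anyDuplicated(y[keys]) > 0) {
    stop("duplicated unit-period rows")
  }
  merge(x, y, by = keys, all.x = TRUE)
}

merged <- checked_merge(a, b, id = "ccode", time = "year")
```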

xtsubset is a wrapper for subset that preserves the xtdata attribute of a data frame. xtsample is a wrapper for sample that selects all observations from a random sample of unique identifiers (thereby preserving the time structure of the data).
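
Sampling units rather than rows keeps every time period of the selected units. The sketch below shows the idea in base R. sample_units is a hypothetical helper and the attribute layout is an assumption; the attribute is copied back explicitly because custom attributes may not survive row subsetting.

```r
set.seed(42)

panel <- data.frame(
  id   = rep(c("A", "B", "C", "D"), each = 3),
  year = rep(2000:2002, times = 4),
  x    = seq_len(12)
)
attr(panel, "xtdata") <- list(data = "id", time = "year")  # hypothetical layout

# Hypothetical helper: draw n identifiers and keep all of their observations.
sample_units <- function(df, n, id) {
  keep <- sample(unique(df[[id]]), n)
  out <- df[df[[id]] %in% keep, ]
  attr(out, "xtdata") <- attr(df, "xtdata")  # restore the panel attribute
  out
}

s <- sample_units(panel, 2, "id")
```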

xtmap, xtplot

xtmap calls the countrycode, maps and ggplot2 packages to draw choropleth maps, with helper transformations for time series. The examples provided below use QOG cross-sectional data:

QOG = qogdata(tempfile(), version = "bas", variables = c("ccodealp", "undp_hdi", "ihme_nm"))

The result is a ggplot2 object with the map and data plotted together:

xtmap(subset(QOG, ccodealp != "RUS"), "ihme_nm", continent = "Asia", iso3n = "ccode") +
  ggtitle("Neonatal Mortality Rate per 1,000 births (IHME, 2009)")

Use the quantize option to map quantiles of a continuous variable on the fly:

xtmap(QOG, "undp_hdi", quantize = 3, continents = c("Africa", "Asia"), iso3n = "ccode") +
  scale_fill_brewer("", palette = "RdYlBu", labels = c("Low", "Med", "High")) +
  ggtitle("Human Development Index (UNDP, 2009-2010)")
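
Mapping quantiles of a continuous variable comes down to cutting it at its empirical quantiles. The base R sketch below does this for three groups; the values are made up, and the quantize option of xtmap may compute its breaks differently.

```r
# Made-up index values for nine countries.
hdi <- c(0.35, 0.42, 0.51, 0.58, 0.66, 0.73, 0.81, 0.88, 0.94)

# Three groups need four break points: minimum, two inner quantiles, maximum.
breaks <- quantile(hdi, probs = seq(0, 1, length.out = 3 + 1), na.rm = TRUE)

tercile <- cut(hdi, breaks = breaks, include.lowest = TRUE,
               labels = c("Low", "Med", "High"))
table(tercile)
```

With heavily tied data the quantile breaks may not be unique, in which case cut fails and fewer groups are needed.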

The function matches countries to geographic information from the world map provided in the maps package, and currently suffers from a little map projection bug as soon as you include Russia. It also adds continents and regions with the countrycode package to allow plots of specific geographic areas.

When provided with a data frame carrying the xtdata attribute, the function currently plots the latest time period, which corresponds to the 'most recent year' in country-level data. If quantize.t is used to create time intervals, the function plots facets of maps.

xtplot is a stub for a similar function that plots time series out of data frames carrying an xtdata attribute.

xtmissing

xtmissing plots a basic tile plot of missing and nonmissing values in a data frame with an xtdata attribute, with observations in rows and ordered time periods in columns.
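
The layout behind such a plot, with units in rows and ordered time periods in columns, can be computed in base R before any plotting. This is a sketch on made-up data, not the code used by xtmissing.

```r
panel <- data.frame(
  id   = rep(c("A", "B"), each = 3),
  year = rep(2000:2002, times = 2),
  x    = c(1, NA, 3, NA, NA, 6)
)

# TRUE where x is missing; one row per unit, one column per year.
miss <- tapply(is.na(panel$x), list(panel$id, panel$year), any)
miss
```

The resulting logical matrix can then be passed to any tile-based plotting function.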

get_eurostat

get_eurostat is a soon-to-be-added downloader for Eurostat data. See the repository wiki for development notes, or see the eurostat_r script to bulk import Eurostat data in R.

get_wdi

get_wdi calls the WDI package to download one or more World Development Indicators provided through the World Bank API. The result is a data frame with an xtdata attribute ready for merging onto ISO-3N country codes, converted from the ISO-3C country codes returned by WDI.

This function makes it easy to update a WDI within a QOG dataset to the latest measurements, and to compare measurement differences between the latest QOG and WDI series.

The WDI data for such a comparison can be retrieved with get_wdi as follows:

WDI = get_wdi(x = "SH.XPD.PCAP.PP.KD", add = "income")

The code for the comparative measurements plot is in the documentation of the get_wdi function.

get_uds

get_uds downloads the Unified Democracy Scores (UDS) from Pemstein et al. The result is a data frame with an xtdata attribute ready for merging onto Correlates of War (COW) numeric country codes.

get_gwpt

get_gwpt downloads state independence data from Gleditsch & Ward and state coups data from Powell & Thyne. The result is a data frame with an xtdata attribute that relies on idiosyncratic identifiers (Gleditsch and Ward country codes), but that might still be merged with xtcountry and xtmerge (see above).

CREDITS

qogdata takes inspiration from two QOG packages for Stata users, qog by Christoph Thewes and qogbook by Richard Svensson. Further credits are due to the authors of the QOG datasets, and a list of related R packages appears on the wiki.