
Party positions from Wikipedia classifications

Herrmann, Michael, and Holger Döring. 2023. “Party Positions from Wikipedia Classifications of Party Ideology.” Political Analysis 31(1): 22–41. — doi: 10.1017/pan.2021.28

Holger Döring, and Michael Herrmann. [YEAR] “Party Positions from Wikipedia Tags.” — doi: 10.5281/zenodo.7043510

Holger Döring — holger.doering@gesis.org
Michael Herrmann — michael.herrmann@uni-konstanz.de

Results

party positions and tags in party-position-tags.csv
tag positions in tag-position.csv
visualization of parties by country and tags

Install

Running all scripts requires R, Python and Stan.

We use Docker as a replication environment. It includes R, RStudio, Python, Stan and all packages (see Dockerfile).

docker-compose up -d  # start container in detached mode

docker-compose down   # shut down container

http://localhost:8787/ — RStudio in a browser with all dependencies

Project structure

Note: the project uses the RStudio project workflow (0-wp-data.Rproj). All R scripts use the project root as the base path, and all file paths are relative to it.

z-run-all.R – stepwise execution of all scripts (R and Python)
data-files-docs.csv – documentation of all datasets (path, type, description)
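As a quick way to get an overview of the documented datasets, the documentation file can be read with Python's standard library. This is a minimal sketch: the column names follow the three fields listed above, while the sample rows, the values in the type column, and the helper name load_dataset_docs are assumptions made for illustration.

```python
import csv
import io

# Hypothetical excerpt of data-files-docs.csv. The three column names follow
# the README (path, type, description); the rows and "type" values are assumed.
SAMPLE = """path,type,description
party-position-tags.csv,csv,party positions and tags
tag-position.csv,csv,tag positions
"""


def load_dataset_docs(handle):
    """Return one dict per documented dataset."""
    return list(csv.DictReader(handle))


docs = load_dataset_docs(io.StringIO(SAMPLE))
for row in docs:
    print(f"{row['path']} ({row['type']}): {row['description']}")
```

To read the real file, pass an open file handle instead of the in-memory sample, for example open("data-files-docs.csv", newline="", encoding="utf-8"), and adjust the column names if they differ.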

Folders

01-data-sources
- 01-partyfacts — Party Facts data
- 02-wikipedia — Wikipedia data and infobox tags
- 03-party-positions — party position data for validation (CHES, DALP, Manifesto, WVS)
02-data-preparation — create datasets for analysis
03-estimation — estimation of models and post-estimation
04-data-final – datasets with party and tag positions (Model 2 only)
05-validation – validation of party positions (Model 2 only)
06-figures-tables – visualization of results (Model 2 only)

Tag harmonization

A dataset of Wikipedia tags is created in 02-data-preparation/01-wp-infobox.R.

some minor harmonization of category names
selects only categories that are used twice
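The two steps above can be illustrated in Python. This is a sketch, not the logic of 01-wp-infobox.R: the harmonization rule (lower-casing, treating hyphens as spaces, collapsing whitespace), the reading of "used twice" as "used by at least two parties", the sample tags, and the names harmonize, clean_tags and min_parties are all assumptions.

```python
from collections import Counter

# Hypothetical raw infobox tags per party (illustrative values only).
party_tags = {
    "party_a": ["Social democracy", "Pro-Europeanism"],
    "party_b": ["social-democracy", "Green politics"],
    "party_c": ["Pro-Europeanism ", "Agrarianism"],
}


def harmonize(tag):
    """Lower-case, treat hyphens as spaces, collapse whitespace (assumed rule)."""
    return " ".join(tag.lower().replace("-", " ").split())


def clean_tags(party_tags, min_parties=2):
    """Harmonize tags and keep those used by at least `min_parties` parties."""
    harmonized = {
        party: sorted({harmonize(t) for t in tags})
        for party, tags in party_tags.items()
    }
    counts = Counter(t for tags in harmonized.values() for t in tags)
    return {
        party: [t for t in tags if counts[t] >= min_parties]
        for party, tags in harmonized.items()
    }


cleaned = clean_tags(party_tags)
for party, tags in cleaned.items():
    print(party, tags)
```

Counting each tag once per party keeps a tag that one party lists twice from passing the threshold on its own.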

The dataset used for the analysis is created in 02-data-preparation/02-wp-data.R.

filter most frequent tags — see parameter
create dataset in wide format with tags as variable names
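The filtering and reshaping described above could look as follows in Python. This is a sketch under stated assumptions and not the code in 02-wp-data.R: top_n is a hypothetical stand-in for the frequency parameter mentioned above, the sample tags are invented, and the wide format is assumed to use 0/1 indicators with one column per tag.

```python
from collections import Counter

# Hypothetical harmonized tags per party (illustrative values only).
party_tags = {
    "party_a": ["liberalism", "pro-europeanism"],
    "party_b": ["liberalism", "conservatism"],
    "party_c": ["conservatism", "liberalism"],
}


def to_wide(party_tags, top_n=2):
    """Keep the `top_n` most frequent tags and build one 0/1 row per party."""
    counts = Counter(t for tags in party_tags.values() for t in set(tags))
    # Sort by descending frequency, then by name, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    columns = sorted(tag for tag, _ in ranked[:top_n])
    rows = [
        {"party": party, **{tag: int(tag in tags) for tag in columns}}
        for party, tags in sorted(party_tags.items())
    ]
    return columns, rows


columns, rows = to_wide(party_tags, top_n=2)
print(["party"] + columns)
for row in rows:
    print(row)
```

Each tag becomes a variable name, and each party gets a 1 where the tag appears in its infobox and a 0 otherwise.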

Estimation

Model 2 (and Model 1) can be estimated in 03-estimation.

We use only Model 2 for post-estimation and for the subsequent preparation of the final data, figures and tables.

Party positions

We include party position data for validation — see 01-data-sources/03-party-positions/

Chapel Hill Expert Survey (CHES) – trend file 1999–2019
Democratic Accountability and Linkages Project (DALP) expert survey (Kitschelt 2013)
Manifesto Project (MP) – left-right (rile) scores
World Values Survey (WVS) – voters' left-right self-placement, Wave 6, 2010–2014
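One common way to compare estimated positions with an external source such as those listed above is a correlation over the parties both sources cover. Whether the scripts in 05-validation proceed this way is not stated here, so the sketch below is only an illustration: the party IDs, all numeric values, and the helper name pearson are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented values: estimated positions and an external expert score per party.
estimated = {"p1": -1.2, "p2": 0.1, "p3": 1.4}
external = {"p1": 2.0, "p2": 5.0, "p3": 8.5, "p4": 4.0}

# Compare only parties present in both sources.
shared = sorted(set(estimated) & set(external))
r = pearson([estimated[p] for p in shared], [external[p] for p in shared])
print(f"parties compared: {len(shared)}, correlation: {r:.3f}")
```

Restricting the comparison to shared party IDs matters because the validation sources cover different sets of parties.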

Changes

Differences between the revised code and the paper-based code used in the replication material:

Herrmann, Michael, and Holger Döring. 2021. “Replication Data for: Party Positions from Wikipedia Classifications of Party Ideology.” — doi: 10.7910/DVN/1JHZIU

Data

new (revised) main final dataset — 04-descriptives/party-tags-positions.csv
removed historical and faction tag sections

Code

Stan statistical computing platform used for estimation (JAGS deprecated)
new folder structure with index numbers
fewer R package dependencies
focus on Model 2 (Model 1 estimation only)
removed tables and figures only relevant for paper
revised documentation of all scripts
