Party positions from Wikipedia classifications
Herrmann, Michael, and Holger Döring. 2023. “Party Positions from Wikipedia Classifications of Party Ideology.” Political Analysis 31(1): 22–41. — doi: 10.1017/pan.2021.28
Döring, Holger, and Michael Herrmann. [YEAR]. “Party Positions from Wikipedia Tags.” — doi: 10.5281/zenodo.7043510
- Holger Döring — holger.doering@gesis.org
- Michael Herrmann — michael.herrmann@uni-konstanz.de
Results
- party positions and tags in party-position-tags.csv
- tag positions in tag-position.csv
- visualization of parties by country and tags
Install
Running all scripts requires R, Python and Stan.
We use Docker as a replication environment. It includes R, RStudio, Python, Stan and all packages (see Dockerfile).
docker-compose up -d # start container in detached mode
docker-compose down # shut down container
http://localhost:8787/ — RStudio in a browser with all dependencies
Project structure
Note: The project uses the RStudio project workflow (0-wp-data.Rproj). All R scripts use the project root as their base path, and all file paths are relative to it.
- z-run-all.R — stepwise execution of all scripts (R and Python)
- data-files-docs.csv — documentation of all datasets (path, type, description)
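A quick consistency check of data-files-docs.csv could look like the sketch below. The `path` column comes from the description above. The helper name `missing_files` and the assumption that documented paths are relative to the project root are illustrative and not taken from the repository.

```python
import csv
from pathlib import Path


def missing_files(docs_csv, root="."):
    """Return documented paths that do not exist under the project root.

    Assumes the CSV has a 'path' column with paths relative to `root`
    (an assumption; check the actual file before relying on it).
    """
    root = Path(root)
    with open(docs_csv, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    return [row["path"] for row in rows if not (root / row["path"]).exists()]
```

Run from the project root, `missing_files("data-files-docs.csv")` would list every documented dataset that has not been created yet, for example before all scripts have been executed.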
Folders
- 01-data-sources
- 01-partyfacts — Party Facts data
- 02-wikipedia — Wikipedia data and infobox tags
- 03-party-positions — party position data for validation (CHES, DALP, Manifesto, WVS)
- 02-data-preparation — create datasets for analysis
- 03-estimation — estimation of models and post-estimation
- 04-data-final — datasets with party and tag positions (only M2)
- 05-validation — validation of party positions (only M2)
- 06-figures-tables — visualization of results (only M2)
Tag harmonization
A dataset of Wikipedia tags is created in 02-data-preparation/01-wp-infobox.R.
- some minor harmonization of category names
- selects only categories that are used at least twice
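The frequency filter can be illustrated with a minimal Python sketch. It is not the code in 01-wp-infobox.R: the function name `keep_repeated_tags`, the input layout (party identifier mapped to a list of tags) and the reading of "used twice" as "used by at least two parties" are all assumptions.

```python
from collections import Counter


def keep_repeated_tags(party_tags, min_count=2):
    """Drop tags that occur in fewer than `min_count` parties.

    `party_tags` maps a party identifier to its list of infobox tags
    (hypothetical layout, chosen for illustration).
    """
    # Count each tag once per party so duplicates within one infobox
    # do not inflate the counts.
    counts = Counter(tag for tags in party_tags.values() for tag in set(tags))
    return {
        party: [tag for tag in tags if counts[tag] >= min_count]
        for party, tags in party_tags.items()
    }
```

Tags used by a single party carry no information for scaling parties against each other, which is one plausible reason for a filter of this kind.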
The dataset used for the analysis is created in 02-data-preparation/02-wp-data.R.
- filter most frequent tags — see parameter
- create dataset in wide format with tags as variable names
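The two steps above can be sketched in Python as follows. This is an illustration, not the logic of 02-wp-data.R: the name `to_wide` and the parameter `n_tags` are hypothetical, and the sketch assumes the script's parameter is a number of tags to keep rather than a frequency threshold.

```python
from collections import Counter


def to_wide(party_tags, n_tags=3):
    """Build one row per party with a 0/1 column per frequent tag.

    `party_tags` maps a party identifier to its list of tags
    (hypothetical layout). Only the `n_tags` most frequent tags
    become columns.
    """
    counts = Counter(tag for tags in party_tags.values() for tag in set(tags))
    # most_common() orders tags from most to least frequent.
    top_tags = [tag for tag, _ in counts.most_common(n_tags)]
    rows = []
    for party, tags in party_tags.items():
        row = {"party": party}
        # 1 if the party carries the tag, 0 otherwise.
        row.update({tag: int(tag in tags) for tag in top_tags})
        rows.append(row)
    return rows
```

Each resulting row has the party identifier plus one indicator column per retained tag, which is the shape an item-response style model expects as input.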
Estimation
Model 2 (and Model 1) can be estimated in 03-estimation.
We use only Model 2 for post-estimation and the subsequent preparation of final data, figures and tables.
Party positions
We include party position data for validation — see 01-data-sources/03-party-positions/
- Chapel Hill Expert Survey (CHES) – trend file 1999–2019
- Democratic Accountability and Linkages Project (DALP) expert survey (Kitschelt 2013)
- Manifesto Project (MP) – left-right (rile) scores
- World Values Survey (WVS) — voters' left-right self-placement, Wave 6, 2010–2014
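Validating against one of these sources amounts to comparing two position scores for the same parties. The sketch below computes a Pearson correlation over the parties present in both sources. Whether the scripts in 05-validation use this statistic is an assumption, and the function name and dictionary layout are hypothetical.

```python
from math import sqrt


def position_correlation(wiki, external):
    """Pearson correlation of two position dicts over shared parties.

    Both arguments map a party identifier to a numeric position
    (hypothetical layout). Parties missing from either source are
    ignored. Constant scores in either source would divide by zero.
    """
    shared = sorted(set(wiki) & set(external))
    if len(shared) < 2:
        raise ValueError("need at least two shared parties")
    x = [wiki[p] for p in shared]
    y = [external[p] for p in shared]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

Because the external sources use different scales (for example expert placements versus rile scores), a scale-free measure such as a correlation is a natural first comparison.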
Changes
Differences between the revised code and the paper-based code in the replication material:
Herrmann, Michael, and Holger Döring. 2021. “Replication Data for: Party Positions from Wikipedia Classifications of Party Ideology.” — doi: 10.7910/DVN/1JHZIU
Data
- new (revised) main final dataset — 04-descriptives/party-tags-positions.csv
- remove historical and faction tags sections
Code
- Stan statistical computing platform used for estimation (JAGS deprecated)
- new folder structure with index numbers
- fewer R package dependencies
- focus on Model 2 (Model 1 estimation only)
- removed tables and figures only relevant for paper
- revised documentation of all scripts
License
MIT — Copyright (c) 2022 Holger Döring and Michael Herrmann