German Parliamentary Corpus (GerParCor)

Abstract

In 2022, GerParCor was collected and published: the largest German-language corpus of parliamentary protocols, spanning three centuries and covering the national and federal levels of Germany, Austria, Switzerland and Liechtenstein. GerParCor made it possible, for the first time, to provide various parliamentary protocols that had not been available digitally and could not be retrieved and processed in a uniform manner. The corpus was also preprocessed using NLP methods and made available in XMI format. In this paper, GerParCor is significantly updated: all new parliamentary protocols are included, and further protocols not previously covered are added and preprocessed, so that the corpus now reaches back to 1797. Besides integrating new, state-of-the-art NLP preprocessing suited to handling large text corpora, this update also gives an overview of the further reuse of GerParCor by presenting various provisioning capabilities, such as APIs.

GerParCor is available via https://gerparcor.texttechnologylab.org

GerParCor 2022

GerParCor 2022 is available via http://lrec2022.gerparcor.texttechnologylab.org

| #  | Parliament             | Sessions | From       | Until      | Status / Download |
|----|------------------------|----------|------------|------------|-------------------|
| 1  | Reichstag (NG + Zoll)  | 1990     | 02/25/1867 | 05/24/1895 | Download          |
| 2  | Reichstag (Empire)     | 2183     | 12/03/1895 | 10/26/1918 | Download          |
| 3  | Weimar Republic        | 1328     | 02/06/1919 | 12/09/1932 | Download          |
| 4  | Third Reich            | 20       | 03/21/1933 | 04/24/1942 | Download          |
| 5  | Bundesrat              | 1008     | 09/07/1949 | 10/08/2021 | Download          |
| 6  | Bundestag              | 4158     | 09/07/1949 | 09/07/2021 | Download          |
| 7  | Baden-Württemberg      | 412      | 06/05/1984 | 09/29/2021 | Download          |
| 8  | Bayern                 | 2221     | 12/16/1946 | 10/14/2021 | Download          |
| 9  | Berlin                 | 582      | 04/02/1989 | 09/16/2021 | Download          |
| 10 | Brandenburg            | 442      | 10/26/1990 | 08/27/2021 | Download          |
| 11 | Bremen                 | 1102     | 07/04/1995 | 09/16/2021 | Download          |
| 12 | Hamburg                | 586      | 10/08/1997 | 11/03/2021 | Download          |
| 13 | Hessen                 | 1297     | 02/04/1947 | 09/29/2021 | Download          |
| 14 | Mecklenburg-Vorpommern | 659      | 10/26/1990 | 06/11/2021 | Download          |
| 15 | Niedersachsen          | 1109     | 06/22/1982 | 09/15/2021 | Download          |
| 16 | Nordrhein-Westfalen    | 2041     | 05/21/1947 | 10/08/2021 | Download          |
| 17 | Rheinland-Pfalz        | 1562     | 07/24/1947 | 09/22/2021 | Download          |
| 18 | Saarland               | 876      | 07/23/1959 | 09/15/2021 | Download          |
| 19 | Sachsen                | 690      | 10/27/1990 | 11/18/2021 | Download          |
| 20 | Sachsen-Anhalt         | 607      | 10/28/1990 | 09/17/2021 | Download          |
| 21 | Schleswig-Holstein     | 1776     | 02/26/1946 | 02/11/2021 | Download          |
| 22 | Thüringen              | 761      | 10/25/1990 | 11/19/2021 | Download          |
| 23 | Liechtenstein          | 504      | 03/13/1997 | 11/06/2021 | Download          |
| 24 | Nationalrat (AT)       | 4267     | 10/21/1918 | 05/17/2021 | Download          |
| 25 | Nationalrat (CH)       | 368      | 12/06/1999 | 12/09/2021 | Download          |
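
The dates in the table above are written as MM/DD/YYYY (entries such as 12/16/1946 rule out a day-first reading). As a minimal sketch of working with the table programmatically, the following Python snippet parses three rows copied from it and sums their session counts. The split of each row into session count and dates is inferred from the page layout, and the names `ROWS`, `parse_us_date` and `summarize` are hypothetical helpers, not part of any GerParCor tooling.

```python
from datetime import date, datetime

# Three rows copied from the table: (parliament, sessions, from, until).
# The column split is an assumption inferred from the page layout.
ROWS = [
    ("Reichstag (NG + Zoll)", 1990, "02/25/1867", "05/24/1895"),
    ("Bundestag", 4158, "09/07/1949", "09/07/2021"),
    ("Liechtenstein", 504, "03/13/1997", "11/06/2021"),
]


def parse_us_date(text: str) -> date:
    """Parse a MM/DD/YYYY date string as used in the table."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def summarize(rows):
    """Return (total sessions, earliest start, latest end) for the rows."""
    total = sum(sessions for _, sessions, _, _ in rows)
    earliest = min(parse_us_date(start) for _, _, start, _ in rows)
    latest = max(parse_us_date(end) for _, _, _, end in rows)
    return total, earliest, latest


total, earliest, latest = summarize(ROWS)
print(f"{total} sessions between {earliest.isoformat()} and {latest.isoformat()}")
```

Converting to `date` objects first keeps comparisons chronological; comparing the raw MM/DD/YYYY strings would sort by month rather than by year.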

Cite

If you use the project or the corpus, please cite it as follows:

BibTeX

@InProceedings{Abrami:Bagci:Hammerla:Mehler:2022,
  author         = {Abrami, Giuseppe and Bagci, Mevl\"{u}t and Hammerla, Leon and Mehler, Alexander},
  title          = {German Parliamentary Corpus (GerParCor)},
  booktitle      = {Proceedings of the Language Resources and Evaluation Conference},
  month          = {June},
  year           = {2022},
  address        = {Marseille, France},
  publisher      = {European Language Resources Association},
  pages          = {1900--1906},
  url            = {https://aclanthology.org/2022.lrec-1.202}
}

@inproceedings{Abrami:et:al:2024,
    address   = {Torino, Italy},
    author    = {Abrami, Giuseppe and Bagci, Mevl{\"u}t and Mehler, Alexander},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    editor    = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen},
    month     = {may},
    pages     = {7707--7716},
    publisher = {ELRA and ICCL},
    title     = {{G}erman Parliamentary Corpus ({G}er{P}ar{C}or) Reloaded},
    url       = {https://aclanthology.org/2024.lrec-main.681},
    year      = {2024}
}