German-NLP

Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German.

This list primarily includes resources and tools that are currently maintained and can be used either off-the-shelf or with minor adjustments. It is deliberately biased toward usability and user-friendliness.

Community support is needed to keep this list up to date. Pull requests and suggestions are welcome! See contributing guidelines.

Table of Contents

Text corpora

General-purpose

Historical

Specialized

Swiss German

Learner and Error Corpora

Word lists

Data acquisition

Lists of corpora

Generic resources

Frameworks

Treebanks

Deep learning models and transformers

Annotation

Standards

Linguistic processing

Preprocessing

Tokenization / Sentence boundary detection

Stemming

Lemmatization

Morphological analysis

Normalization

Phonology

POS-tagging

Syntactical parsing

Named Entity Recognition

Misc

Text generation

Industry/Applications

Evaluation

Semantic analysis

Datasets

Word embeddings and senses

Sentiment analysis datasets / polarity clues

Sentiment detection

GermEval

(category to improve)

Discourse

Summarization and Simplification

Psycholinguistics

Speech NLP

Machine Translation

(category to improve)

Parallel corpora

Large Language Models

Teaching resources and tutorials

More lists

German

General

Comparable lists

Larger institutional GitHub groups

Contributors

See the list of contributors.

License

CC-BY