🇩🇪 🦙 🛁 Cleaned German Alpaca Dataset

Welcome to the Cleaned German Alpaca Dataset repository! This repository hosts cleaned, curated and translated versions of the Cleaned Alpaca Dataset.

Datasets

Dataset 1

Dataset 2

JSON attributes:

Contributions

The dataset contains over 52k entries, and several issues still exist. Please help out by submitting a pull request.
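A quick structural check can help narrow down which entries to look at before opening a pull request. The sketch below is only an illustration: it assumes each record is a JSON object with the string fields `instruction`, `input` and `output`, as in the original Alpaca layout. This page does not list the attributes, so these field names, the `find_issues` and `check_file` helpers, and any file path passed to them are assumptions to adapt to the actual files.

```python
import json

# Assumed field names, borrowed from the original Alpaca layout.
# Adjust them to match the attributes actually used in this repository.
EXPECTED_KEYS = ("instruction", "input", "output")


def find_issues(records):
    """Return (index, message) pairs for records that look malformed."""
    issues = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            issues.append((i, "record is not a JSON object"))
            continue
        # Every assumed key should be present and hold a string.
        for key in EXPECTED_KEYS:
            if key not in rec:
                issues.append((i, f"missing key: {key}"))
            elif not isinstance(rec[key], str):
                issues.append((i, f"non-string value for: {key}"))
        # A blank instruction or output usually indicates a broken entry;
        # a blank input is allowed, since many instructions need no input.
        for key in ("instruction", "output"):
            if isinstance(rec.get(key), str) and not rec[key].strip():
                issues.append((i, f"empty value for: {key}"))
    return issues


def check_file(path):
    """Load a JSON array of records from `path` (hypothetical) and check it."""
    with open(path, encoding="utf-8") as f:
        return find_issues(json.load(f))
```

This only catches structural problems. Translation errors and factual mistakes still need manual review.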

Goals

The primary goal of this project is to provide a cleaned and curated version of a German Alpaca dataset. Removing errors and inconsistencies from the data is intended to improve the performance of NLP models fine-tuned on it.

Acknowledgments

We would like to thank the authors of the Cleaned Alpaca dataset for their effort.

We would like to thank the original creators of the Alpaca datasets for making their data available to the public.

Licensing

The Cleaned German Alpaca Dataset is licensed under CC BY-NC 4.0.

The software and tools in this repository are licensed under the MIT License (the "License"); you may not use them except in compliance with the License. You may obtain a copy of the License in the LICENSE file in the repository.