Awesome Human Preference Datasets for LLM 🧑❤️🤖

A curated list of open-source human preference datasets for LLM instruction tuning, RLHF, and evaluation.

For general NLP datasets and text corpora, check out this awesome list.

Datasets

OpenAI WebGPT Comparisons

OpenAI Summarization

Anthropic Helpfulness and Harmlessness Dataset (HH-RLHF)

OpenAssistant Conversations Dataset (OASST1)

Stanford Human Preferences Dataset (SHP)

Reddit ELI5

Human ChatGPT Comparison Corpus (HC3)

HuggingFace H4 StackExchange Preference Dataset

ShareGPT.com

Alpaca

GPT4All

Databricks Dolly Dataset

HH_golden