HuggingFaceH4/ultrachat_200k
Viewer
•
Updated
•
515k
•
28.6k
•
632
Curated SFT datasets for instruction-following and conversational fine-tuning
Note Multi-turn conversations, 515K samples, MIT license - Best for chat
Note GPT-4 generated, 1M samples, Apache 2.0 - High quality general purpose
Note GPT-3.5/4 augmented FLAN, 2.94M samples, MIT - Large scale training
Note Curated subset, 100K samples, Apache 2.0 - Quick experiments