license: apache-2.0
language:
  - en
library_name: fasttext
tags:
  - quality-classification
datasets:
  - teknium/OpenHermes-2.5
  - HuggingFaceH4/ultrachat_200k
  - vincentmin/eli5_rlhf
  - mlfoundations/dclm-baseline-1.0
  - allenai/WildChat-1M

Dolma 3 Quality Classifier

A fastText model trained to classify documents as high or low quality. It is part of the Dolma 3 pipeline.
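
As a rough illustration of how a binary fastText quality classifier like this is typically queried, the sketch below turns a fastText-style prediction into a single quality score. It is not taken from the Dolma 3 code: the helper name `quality_score` and the label `__label__hq` are hypothetical placeholders, and the actual label names should be checked against the released model. The only behaviour assumed from fastText itself is that `predict(text, k=-1)` returns a pair of labels and probabilities and expects a single line of text.

```python
from typing import Callable, Sequence, Tuple

# Shape of fastText's Python `predict(text, k=...)`: (labels, probabilities).
PredictFn = Callable[..., Tuple[Sequence[str], Sequence[float]]]


def quality_score(
    predict_fn: PredictFn,
    text: str,
    positive_label: str = "__label__hq",  # hypothetical label name (assumption)
) -> float:
    """Return the probability assigned to `positive_label`, or 0.0 if absent."""
    # fastText predicts one line at a time, so collapse newlines and
    # other runs of whitespace into single spaces first.
    single_line = " ".join(text.split())
    # k=-1 asks for every label, so the positive label is found even
    # when it is not the top prediction.
    labels, probs = predict_fn(single_line, k=-1)
    for label, prob in zip(labels, probs):
        if label == positive_label:
            return float(prob)
    return 0.0
```

With the `fasttext` package installed, the `predict` method of a model returned by `fasttext.load_model` could be passed as `predict_fn`. The model file name is not stated here, so it is left out of the sketch.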