We train OPT-1.3B using three datasets: Dahoas/rm-static, Dahoas/full-hh-rlhf, and yitingxie/rlhf-reward-datasets.

Dahoas/synthetic-instruct-gptj-pairwise is not used because it does not provide a test set.
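The selection rule above (keep only datasets that ship a test set) could be checked with a small script. This is a minimal sketch, not the procedure actually used for this model: the helper names `has_test_split`, `select_usable`, and `fetch_split_names` are hypothetical, and it assumes that a test set appears as a split literally named `test` and that the Hugging Face `datasets` package is installed for the network lookup.

```python
from typing import Dict, Iterable, List

# Dataset identifiers named in the text above.
CANDIDATES = [
    "Dahoas/rm-static",
    "Dahoas/full-hh-rlhf",
    "yitingxie/rlhf-reward-datasets",
    "Dahoas/synthetic-instruct-gptj-pairwise",
]


def has_test_split(split_names: Iterable[str]) -> bool:
    """Return True if a split named 'test' is present (assumed naming)."""
    return "test" in set(split_names)


def select_usable(splits_by_dataset: Dict[str, List[str]]) -> List[str]:
    """Keep the datasets whose split list includes a test split."""
    return [
        name
        for name, splits in splits_by_dataset.items()
        if has_test_split(splits)
    ]


def fetch_split_names(dataset_id: str) -> List[str]:
    """Look up split names on the Hub (requires network access)."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import get_dataset_split_names

    return get_dataset_split_names(dataset_id)


if __name__ == "__main__":
    splits = {name: fetch_split_names(name) for name in CANDIDATES}
    print(select_usable(splits))
```

Keeping the filter separate from the Hub lookup means the rule can be exercised on a hand-written split listing without downloading anything.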
