sr5434's Collections

RLHF Models

updated 3 days ago

A set of models from my experiments with Reinforcement Learning from Human Feedback (RLHF).


  • sr5434/sft_model

    Text Generation • 0.3B • Updated 12 days ago • 496

  • sr5434/rm_hh_rlhf

    Text Classification • 0.3B • Updated 11 days ago • 115

  • sr5434/rlhf_policy

    Text Generation • 0.3B • Updated 5 days ago • 122