mindchain's Collections

Reward Models

updated about 21 hours ago

NVIDIA Nemotron reward models: 340B, 8B and 14B BRRM, and 70B/32B principle-based variants, for RLHF training, preference learning, and AI alignment research.


  • nvidia/Nemotron-4-340B-Reward

    Updated Jun 19, 2024 • 34 • 125

  • nvidia/Qwen3-Nemotron-8B-BRRM

    Text Generation • Updated 11 days ago • 724 • 8

  • nvidia/Llama-3.3-Nemotron-70B-Reward-Principle

    Text Generation • 71B • Updated Oct 30 • 85 • 5

  • nvidia/Qwen3-Nemotron-32B-GenRM-Principle

    Text Generation • 33B • Updated Oct 30 • 838 • 11

  • nvidia/Qwen3-Nemotron-32B-RLBFF

    Text Generation • 33B • Updated Oct 31 • 122 • 27

  • nvidia/Qwen3-Nemotron-14B-BRRM

    Text Generation • Updated 11 days ago • 209 • 11

  • nvidia/HelpSteer3

    Viewer • Updated Nov 16 • 133k • 2.62k • 93