CFPO-RLHF - an asparius Collection


asparius's Collections

CFPO-RLHF

updated Feb 7

RLHF Checkpoints from Clipping Free Policy Optimization for Large Language Models

asparius/Llama3-8b-openrlhf-rloo

266k • Updated Nov 30, 2025 • 2
asparius/Llama3-8b-openrlhf-rloo-kl0

266k • Updated Dec 2, 2025 • 2
asparius/Llama3-8b-openrlhf-cfpo

266k • Updated Dec 1, 2025 • 1
asparius/Llama3-8b-openrlhf-cfpo-kl0

266k • Updated Dec 4, 2025 • 2
