AIPlans's Collections

Post Training Versions - Qwen 0.6B

updated Mar 27

Different versions of Qwen3 0.6B, where the only difference is the post-training method used. The post-training dataset will be HelpSteer2.

  • AIPlans/Qwen3-0.6B-ORPO

    Text Generation • Updated Nov 28, 2025 • 10

  • AIPlans/Qwen3-0.6B-DPO_NOTLORA

    Text Generation • 0.6B • Updated Nov 25, 2025 • 5

  • AIPlans/Qwen3-0.6B-GRPO_Epoch2

    Text Generation • 0.6B • Updated Dec 18, 2025 • 6

  • AIPlans/Qwen3-0.6B-ReMax

    Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 10 • 2

  • AIPlans/Qwen3-0.6B-IPO

    Reinforcement Learning • 0.6B • Updated Dec 12, 2025 • 21 • 1

  • AIPlans/Qwen3-0.6B-KTO

    Text Generation • Updated Nov 22, 2025 • 8 • 1

  • AIPlans/Qwen3-0.6B-PPO

    Text Generation • 0.6B • Updated Mar 27 • 90 • 1
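
Since the variants listed above are meant to differ only in post-training method, it can help to index them by method before comparing them. The following is a minimal sketch, not part of the collection itself. The repo IDs are copied from the listing; the helper names (`method_from_repo_id`, `variants_by_method`, `load_variant`) are hypothetical, and it is an assumption that every repo loads with `AutoModelForCausalLM` from the `transformers` library.

```python
from typing import Dict, Iterable, List

# Repository IDs copied from the collection listing.
REPO_IDS: List[str] = [
    "AIPlans/Qwen3-0.6B-ORPO",
    "AIPlans/Qwen3-0.6B-DPO_NOTLORA",
    "AIPlans/Qwen3-0.6B-GRPO_Epoch2",
    "AIPlans/Qwen3-0.6B-ReMax",
    "AIPlans/Qwen3-0.6B-IPO",
    "AIPlans/Qwen3-0.6B-KTO",
    "AIPlans/Qwen3-0.6B-PPO",
]

# Prefix shared by every repo in the collection.
BASE_PREFIX = "AIPlans/Qwen3-0.6B-"


def method_from_repo_id(repo_id: str) -> str:
    """Return the post-training method label encoded in the repo name."""
    if not repo_id.startswith(BASE_PREFIX):
        raise ValueError(f"unexpected repo id: {repo_id}")
    suffix = repo_id[len(BASE_PREFIX):]
    # Drop trailing qualifiers such as "_NOTLORA" or "_Epoch2".
    return suffix.split("_", 1)[0]


def variants_by_method(repo_ids: Iterable[str]) -> Dict[str, str]:
    """Map each method label to its repo id."""
    return {method_from_repo_id(r): r for r in repo_ids}


def load_variant(repo_id: str):
    """Load tokenizer and model for one variant.

    Assumption: the repo is a causal LM checkpoint. Requires the
    `transformers` package and network access, so the import is
    deferred until this function is actually called.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


if __name__ == "__main__":
    for method, repo_id in variants_by_method(REPO_IDS).items():
        print(f"{method:>5}  {repo_id}")
```

Calling `load_variant(variants_by_method(REPO_IDS)["DPO"])` would then fetch one variant for comparison, provided the assumption about the checkpoint type holds.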