SAGE - a baohao Collection

updated Mar 27

Self-Hinting Language Models Enhance Reinforcement Learning

baohao/aime24

Viewer • Updated Feb 7 • 30 • 188
baohao/aime25

Viewer • Updated Feb 7 • 30 • 174
baohao/amc23

Viewer • Updated Feb 7 • 40 • 179
baohao/olympiadbench

Viewer • Updated Feb 7 • 675 • 152
baohao/minerva_math

Viewer • Updated Feb 7 • 272 • 132
baohao/math500

Viewer • Updated Feb 7 • 500 • 175
baohao/gpqa

Viewer • Updated Feb 7 • 198 • 89
baohao/mmlu_pro

Viewer • Updated Feb 7 • 12k • 33
baohao/sage_train

Viewer • Updated Feb 26 • 15k • 44
baohao/luffy_train

Viewer • Updated Feb 26 • 15k • 18
baohao/scaf-grpo_train

Viewer • Updated Feb 7 • 15k • 59
Self-Hinting Language Models Enhance Reinforcement Learning

Paper • 2602.03143 • Published Feb 3 • 31
baohao/sage_validation

Viewer • Updated Feb 26 • 1.67k • 69
baohao/SAGE_Llama-3.2-3B-Instruct

4B • Updated Feb 26 • 18
baohao/SAGE_Qwen2.5-7B-Instruct

8B • Updated Feb 26 • 5
baohao/SAGE_Qwen3-4B-Instruct-2507

4B • Updated Feb 26 • 2
baohao/SAGE-light_Qwen2.5-7B-Instruct

8B • Updated Feb 26 • 5 • 2
baohao/SAGE-light_Llama-3.2-3B-Instruct

4B • Updated Feb 26 • 4
baohao/SAGE-light_Qwen3-4B-Instruct-2507

4B • Updated Feb 26 • 3
baohao/SFT_Qwen3-4B-Instruct-2507

4B • Updated Mar 27 • 2
baohao/LUFFY_Qwen3-4B-Instruct-2507

4B • Updated Mar 27 • 11
baohao/GRPO_Qwen3-4B-Instruct-2507

4B • Updated Mar 27 • 4
baohao/Scaf-GRPO_Qwen3-4B-Instruct-2507

4B • Updated Mar 27 • 6
