Hugging Face
2
Andre X
andre930
AI & ML interests
None yet
Recent Activity
upvoted a paper about 22 hours ago
Learning to Hint for Reinforcement Learning
updated a dataset 5 days ago
andre930/controller-rl-data
published a dataset 12 days ago
andre930/controller-rl-data
Organizations
None yet
andre930's models (42)
Sort: Recently updated
andre930/sft-controller • 4B • Updated 21 days ago • 159
andre930/rm_v5_ep3 • 8B • Updated Mar 4
andre930/rm_v4_1ep • 8B • Updated Mar 4 • 1
andre930/rm_v4_3ep • 8B • Updated Mar 4 • 1
andre930/rm_v4_2ep • 8B • Updated Mar 4
andre930/rm_3ep • 8B • Updated Mar 4 • 1
andre930/rm_2ep • 8B • Updated Mar 4 • 1
andre930/rm_1ep • 8B • Updated Mar 4
andre930/rubrics_merge_rm_1_2500 • 8B • Updated Feb 18 • 41
andre930/help_hon_overlap • Updated Nov 27, 2025 • 1
andre930/mahdpo_help_hon_ortho_step7k • Updated Nov 24, 2025
andre930/mahdpo_5_heads_step_2k_old • Updated Nov 24, 2025
andre930/mahdpo_help_hon_ortho_step1k_old • Updated Nov 24, 2025
andre930/mahdpo_help_hon_ortho_step4k • Updated Nov 24, 2025
andre930/mahdpo_5_heads_step_1k • Updated Nov 24, 2025
andre930/mahdpo_5_heads_step_4k • Updated Nov 24, 2025
andre930/mahdpo_5_heads_step_2k • Updated Nov 23, 2025
andre930/mahdpo_help_hon_ortho_step2k • Updated Nov 23, 2025
andre930/mahdpo_5_heads_step_10k • Updated Nov 23, 2025
andre930/mahdpo_help_hon_ortho_step1k • Updated Nov 23, 2025
andre930/mahdpo_5_heads • Updated Nov 23, 2025
andre930/dpo_help_hon_ortho • Updated Nov 23, 2025
andre930/hv_ortho_soup • Updated Nov 23, 2025
andre930/dpo_hon_ortho • Updated Nov 23, 2025
andre930/dpo_help_ortho • Updated Nov 23, 2025
andre930/math_dpo_soup • Updated Nov 23, 2025
andre930/dpo_math_eng • Updated Nov 23, 2025
andre930/dpo_math_acc • Updated Nov 23, 2025
andre930/hv_dpo_soup • Updated Nov 20, 2025
andre930/mahdpo_truth • Updated Nov 20, 2025