Hugging Face
5

Xiaoyang Cao

Sean13
·
https://xiaoyangcao1113.github.io/
  • XiaoyangCao1113
  • xiaoyangcao

AI & ML interests

RLHF, Deep Reinforcement Learning

Recent Activity

updated a model 1 day ago
Sean13/role-drift-compound-systems
published a model 1 day ago
Sean13/role-drift-compound-systems
updated a model 1 day ago
Sean13/maxrl_curriculum_Qwen3-1.7B

Organizations

None yet

models 69

Sean13/role-drift-compound-systems

Updated 1 day ago

Sean13/maxrl_curriculum_Qwen3-1.7B

2B • Updated 1 day ago • 12

Sean13/grpo_curriculum_Qwen3-1.7B

2B • Updated 2 days ago • 25

Sean13/repo-best-llama-re-dpo

Updated Feb 26

Sean13/repo-best-llama-dpo

Updated Feb 26

Sean13/repo-best-mistral-dpo

Updated Feb 26

Sean13/repo-best-mistral-re-dpo

Updated Feb 26

Sean13/repo-best-model

Updated Feb 26

Sean13/llama-8b-instruct-v0.2-cpo-full-label_smoothing-0.1

Text Generation • 266k • Updated Nov 21, 2025 • 2

Sean13/mistral-7b-instruct-v0.2-cpo-full-label_smoothing-0.1

Text Generation • 266k • Updated Nov 21, 2025 • 2

datasets 0

None public yet