Xiaoyang Cao
Sean13
0 followers · 2 following
https://xiaoyangcao1113.github.io/
XiaoyangCao1113
xiaoyangcao
AI & ML interests
RLHF, Deep Reinforcement Learning
Recent Activity
Updated a model about 1 month ago: Sean13/responsibility-decomposition
Published a model about 1 month ago: Sean13/responsibility-decomposition
Updated a model about 2 months ago: Sean13/grpo_nocurriculum_Qwen3-1.7B-100step
Organizations
None yet
Sean13's models (73)
Sean13/mistral-7b-instruct-v0.2-rdpo-full-alpha1.0 • 7B • Updated Sep 22, 2025 • 1
Sean13/mistral-7b-instruct-v0.2-rdpo-full-alpha0.9 • 7B • Updated Sep 19, 2025 • 3
Sean13/mistral-7b-instruct-v0.2-rdpo-full-alpha0.7 • 7B • Updated Sep 19, 2025 • 1
Sean13/mistral-7b-instruct-v0.2-rdpo-full-alpha0.3 • Updated Sep 19, 2025
Sean13/mistral-7b-instruct-v0.2-rcpo-full • Text Generation • 7B • Updated Sep 15, 2025 • 3
Sean13/mistral-7b-instruct-v0.2-cpo-full • Text Generation • 7B • Updated Sep 11, 2025 • 2
Sean13/mistral-7b-instruct-v0.2-simpo-full • Text Generation • 7B • Updated Sep 6, 2025 • 3
Sean13/mistral-7b-instruct-v0.2-rsimpo-full • Text Generation • 7B • Updated Sep 6, 2025 • 2
Sean13/mistral-7b-instruct-v0.2-ipo-full • Text Generation • 7B • Updated Aug 19, 2025 • 3
Sean13/mistral-7b-instruct-v0.2-slic_hf-full • Text Generation • 7B • Updated Aug 11, 2025 • 3
Sean13/mistral-7b-instruct-v0.2-rslic_hf-full • Updated Aug 8, 2025
Sean13/mistral-7b-instruct-v0.2-ripo-full • Text Generation • 7B • Updated Aug 3, 2025 • 1
Sean13/mistral-7b-instruct-v0.2-emdpo-full • 7B • Updated Jul 24, 2025 • 2