Hugging Face
wh-zhu's Collections
PSFT (updated Mar 2)
PSFT+RL models
wh-zhu/Qwen2.5-7B-PSFT-RL-DAPO-90 • Text Generation • 8B • Updated 24 days ago • 226
wh-zhu/Qwen2.5-7B-Instruct-PSFT-1300 • 8B • Updated Jul 26, 2025 • 4
wh-zhu/Qwen2.5-7B-SFT-RL-DAPO-90 • 8B • Updated Aug 13, 2025 • 5
wh-zhu/Qwen2.5-7B-Instruct-SFT-700 • 8B • Updated Jul 26, 2025 • 1
wh-zhu/llama3.1-8B-PSFT-dapo90 • 8B • Updated Aug 13, 2025 • 4
wh-zhu/Llama3.1-8B-Instruct-PSFT-1500 • 8B • Updated Jul 26, 2025 • 1
wh-zhu/Llama3.1-8B-Instruct-SFT-1200 • 8B • Updated Jul 27, 2025 • 1
wh-zhu/llama3.1-8B-SFT-dapo100 • 8B • Updated Aug 14, 2025 • 2
wh-zhu/LLama3.1-8B-Instruct-SFT200warmup-PSFT • 8B • Updated Aug 19, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-SFT100warmup-PSFT • 8B • Updated Aug 19, 2025 • 1