lisali126/DDR1_Q1.5B-GRPOFixReward
Safetensors
qwen2
main
DDR1_Q1.5B-GRPOFixReward/tokenizer.json
Commit History
Training in progress, step 20
168c976
verified
lisali126 committed on Dec 9, 2025