davidoj01/unsloth-phi-4-Instruct-LORA-Open-R1-Code-GRPO-b2-as8
Text Generation
Transformers
Safetensors
open-r1/verifiable-coding-problems-python
llama
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
ee59ed7 | End of training | verified | davidoj01 committed on Apr 4, 2025
19e6d7d | Model save | verified | davidoj01 committed on Apr 4, 2025
6f7da64 | Training in progress, step 500 | verified | davidoj01 committed on Apr 4, 2025
3d9275f | Training in progress, step 450 | verified | davidoj01 committed on Apr 4, 2025
417b40d | Training in progress, step 400 | verified | davidoj01 committed on Apr 3, 2025
4b8a4f5 | Training in progress, step 350 | verified | davidoj01 committed on Apr 3, 2025
daa987a | Training in progress, step 300 | verified | davidoj01 committed on Apr 3, 2025
1942554 | Training in progress, step 250 | verified | davidoj01 committed on Apr 3, 2025
9b25352 | Training in progress, step 200 | verified | davidoj01 committed on Apr 3, 2025
91a78af | Training in progress, step 150 | verified | davidoj01 committed on Apr 2, 2025
23a1d55 | Training in progress, step 100 | verified | davidoj01 committed on Apr 2, 2025
2458cce | Training in progress, step 50 | verified | davidoj01 committed on Apr 2, 2025
d33ee17 | initial commit | verified | davidoj01 committed on Apr 2, 2025