davidoj01/unsloth-phi-4-Instruct-LORA-Open-R1-Code-GRPO-b2-as2
Text Generation
Transformers
Safetensors
open-r1/verifiable-coding-problems-python
llama
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training | 5fb2a81 | verified | davidoj01 committed on Apr 2, 2025
Model save | 83dd45b | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 500 | 0d28500 | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 450 | 2d7dd45 | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 400 | d2167da | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 350 | 717f136 | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 300 | e4df48d | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 250 | dde0c6c | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 200 | 21c5b93 | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 150 | fb18b71 | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 100 | 54b49bc | verified | davidoj01 committed on Apr 2, 2025
Training in progress, step 50 | c4e290d | verified | davidoj01 committed on Apr 2, 2025
initial commit | 2dbfae1 | verified | davidoj01 committed on Apr 2, 2025