Pierizvi/infused-reasoning-phi2
Text Generation
PEFT
Safetensors
gsm8k
English
reasoning
mathematics
grpo
reinforcement-learning
phi-2
step-by-step
mathematical-reasoning
rlhf
License: mit
infused-reasoning-phi2/training_info.json
Pierizvi
epoch-639
c9d92fa
verified
11 months ago
91 Bytes
{
  "step": 639,
  "best_reward": 1.008333444595337,
  "timestamp": "2025-05-06 19:35:56"
}
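
The record above can be read with the Python standard library alone. The following is a minimal sketch, not part of the repository: it assumes the file content is exactly the three fields shown, and that the timestamp follows a "YYYY-MM-DD HH:MM:SS" layout with no timezone (the file does not state one). The name `parse_training_info` is hypothetical.

```python
import json
from datetime import datetime

# Contents of training_info.json, copied from the file shown above.
RAW = (
    '{"step": 639, "best_reward": 1.008333444595337, '
    '"timestamp": "2025-05-06 19:35:56"}'
)


def parse_training_info(text):
    """Parse the JSON record and convert its timestamp to a naive datetime."""
    info = json.loads(text)
    # Assumption: "YYYY-MM-DD HH:MM:SS" with no timezone information.
    info["timestamp"] = datetime.strptime(info["timestamp"], "%Y-%m-%d %H:%M:%S")
    return info


info = parse_training_info(RAW)
print(info["step"], info["best_reward"], info["timestamp"].isoformat())
```

To read the file from disk instead, the same function can be given the text returned by `open("training_info.json").read()`, assuming the file has been downloaded locally.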