Uploaded model

  • Developed by: regulus4869
  • License: apache-2.0
  • Finetuned from model : unsloth/qwen2.5-0.5b-instruct-unsloth-bnb-4bit

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
2
Safetensors
Model size
0.5B params
Tensor type
F16
·
Inference Providers NEW
Input a message to start chatting with regulus4869/ppo_trained_model_gsm8k_ppo_500examples.