R2EGym-7B-Agent-Coder-Instruct (checkpoint-800)

This repository contains a training checkpoint exported from LLaMA-Factory.

  • Base: Qwen/Qwen2.5-Coder-7B-Instruct
  • Training: SFT with DeepSpeed ZeRO-3
  • Checkpoint: checkpoint-800

Notes

  • This repo includes ZeRO optimizer states in global_step800/ for resuming training.
  • For inference, use the model-0000*-of-00004.safetensors shards and the tokenizer files; the global_step800/ directory is not needed.
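
The split between inference files and resume-only files can be sketched in Python as below. This is a minimal sketch, not part of the original export. The shard pattern and the global_step800/ directory come from the notes above. The other file names in INFERENCE_PATTERNS are assumptions based on the usual Hugging Face layout for Qwen2.5 models. The helper names is_inference_file and load_for_inference are hypothetical, and repo_or_dir is a placeholder for wherever this checkpoint lives.

```python
from fnmatch import fnmatch

# Files needed at inference time. Only the shard pattern is stated in the
# notes above; the remaining names are ASSUMED from the common Hugging Face
# layout and may not all be present in this repository.
INFERENCE_PATTERNS = [
    "model-0000*-of-00004.safetensors",
    "model.safetensors.index.json",
    "config.json",
    "generation_config.json",
    "tokenizer*",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
    "added_tokens.json",
]

# Directory holding ZeRO optimizer states (only useful for resuming training).
ZERO_STATE_DIR = "global_step800/"


def is_inference_file(path: str) -> bool:
    """Return True if a repo-relative path is needed for inference."""
    if path.startswith(ZERO_STATE_DIR):
        return False
    return any(fnmatch(path, pattern) for pattern in INFERENCE_PATTERNS)


def load_for_inference(repo_or_dir: str):
    """Load the model and tokenizer with transformers.

    repo_or_dir is a placeholder: a local directory or a Hub repo id that
    contains the safetensors shards and tokenizer files.
    """
    # Imported lazily so the file-selection helper works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_or_dir)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(repo_or_dir, torch_dtype="auto")
    return model, tokenizer
```

If the files are fetched with huggingface_hub, the same pattern list could plausibly be passed as the allow_patterns argument of snapshot_download, so the optimizer states are never downloaded for an inference-only setup.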