KokosDev/checkpoint

DeepSeek R1 Checkpoint 4

This is checkpoint 4 from training the DeepSeek R1 distilled model (deepseek-ai/deepseek-r1-distill-qwen-7b) on mathematical reasoning tasks using GRPO (Group Relative Policy Optimization).
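For readers unfamiliar with GRPO, the core idea is that each sampled completion is scored relative to the other completions drawn for the same prompt, rather than against a learned value function. The sketch below illustrates only that group-relative normalization step. It is not the training code used for this checkpoint: the function name, the `eps` value, and the use of the population standard deviation are assumptions made for illustration (some implementations use the sample standard deviation instead).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one prompt's completions within their group.

    Illustrative helper (hypothetical name): each advantage is
    (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one math prompt, rewarded 1.0 if the answer is correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct answers get roughly +1, incorrect ones roughly -1.
```

If every completion in a group receives the same reward, the advantages are all zero, so that group contributes no learning signal.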

Model: deepseek-ai/deepseek-r1-distill-qwen-7b
Checkpoint Step: 4
Training Details: Trained on 8 NVIDIA B200 GPUs with a batch size of 6 per GPU for 6 epochs.
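As a quick sanity check on the numbers above, the global batch size per optimizer step can be worked out from the GPU count and the per-GPU batch size. The card does not say whether gradient accumulation was used, or whether the batch counts prompts or sampled completions, so `grad_accum_steps=1` below is an assumption and the helper name is hypothetical.

```python
NUM_GPUS = 8        # from the training details
PER_GPU_BATCH = 6   # from the training details


def global_batch_size(num_gpus, per_gpu_batch, grad_accum_steps=1):
    """Samples consumed per optimizer step under plain data parallelism.

    grad_accum_steps defaults to 1, which is an assumption; the card
    does not state the value used in training.
    """
    return num_gpus * per_gpu_batch * grad_accum_steps


batch = global_batch_size(NUM_GPUS, PER_GPU_BATCH)  # 8 * 6 * 1 = 48
```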

Model size (Safetensors): 8B params
Tensor type: BF16
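A minimal sketch of how the checkpoint might be loaded with the `transformers` library, keeping the weights in BF16 to match the tensor type listed here. It assumes the Hub repo id is `KokosDev/checkpoint` (taken from the page header) and that the repo ships tokenizer files alongside the weights; neither is confirmed by this card. The helper names are hypothetical, and nothing is downloaded until `load_checkpoint` is called.

```python
REPO_ID = "KokosDev/checkpoint"  # assumed repo id, taken from the page header


def load_checkpoint(repo_id=REPO_ID):
    """Load tokenizer and model, keeping weights in BF16."""
    # Imported lazily so defining these helpers does not require torch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumes tokenizer files are present in the repo (not confirmed).
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16
    )
    return tokenizer, model


def generate_answer(tokenizer, model, question, max_new_tokens=256):
    """Generate a completion for a single question string."""
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_checkpoint()` followed by `generate_answer(tokenizer, model, "What is 12 * 7?")`. If the repo turns out to lack tokenizer files, loading the tokenizer from the base model named above is a reasonable fallback.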