KokosDev/checkpoint


# DeepSeek R1 Checkpoint 4

This is checkpoint 4 from training the DeepSeek R1 distilled model listed below on mathematical reasoning tasks using GRPO (Group Relative Policy Optimization).

- Model: `deepseek-ai/deepseek-r1-distill-qwen-7b`
- Checkpoint Step: 4
- Training Details: Trained on 8 NVIDIA B200 GPUs with a batch size of 6 per GPU for 6 epochs.
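
The training details above imply a global batch size, which the following minimal sketch computes. It assumes plain data parallelism and no gradient accumulation; the card states neither, and `global_batch_size` and `grad_accum_steps` are hypothetical names used only for illustration. Whether the per-GPU batch counts prompts or sampled completions in the GRPO setup is also not stated.

```python
def global_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step under plain data parallelism."""
    return num_gpus * per_gpu_batch * grad_accum_steps


# Values from the card: 8 GPUs, batch size 6 per GPU.
# grad_accum_steps=1 is an assumption; the card does not mention accumulation.
print(global_batch_size(num_gpus=8, per_gpu_batch=6))  # 48
```

If gradient accumulation was used during training, multiply by the number of accumulation steps to get the per-update batch size.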