atomwalk12/LinalgZero-GRPO-merged

Model Card for LinalgZero-GSPO

Information and code used to train this model are available on GitHub.

This model is a fine-tuned version of atomwalk12/LinalgZero-SFT on the atomwalk12/linalgzero-grpo dataset using the GSPO algorithm. It has been trained using ART.
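For readers unfamiliar with GSPO (Group Sequence Policy Optimization), the sketch below illustrates the sequence-level clipped objective as it is commonly described: rewards are standardized within a group of responses to the same prompt, and each response gets one length-normalized importance ratio rather than one ratio per token. This is a minimal illustration in plain Python, not the ART training code used for this model. The function names, the `clip_eps` value, and the use of the population standard deviation are assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt.

    Assumption: population standard deviation plus a small eps for stability.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def sequence_ratio(new_logps, old_logps):
    """Length-normalized sequence-level importance ratio.

    This is exp of the mean per-token log-probability difference, i.e. the
    geometric mean of the per-token probability ratios.
    """
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))


def gspo_objective(new_logps, old_logps, rewards, clip_eps=0.2):
    """Clipped surrogate objective averaged over one group (to be maximized).

    new_logps / old_logps: one list of per-token log-probabilities per response,
    under the current and the sampling policy respectively.
    clip_eps: illustrative placeholder, not the value used to train this model.
    """
    advantages = group_advantages(rewards)
    terms = []
    for new, old, adv in zip(new_logps, old_logps, advantages):
        ratio = sequence_ratio(new, old)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Clipping is applied to the whole sequence, not to individual tokens.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

When the current and sampling policies agree, every ratio is 1 and the objective reduces to the mean of the standardized advantages, which is zero. In a real trainer the log-probabilities would be tensors and the negated objective would be used as the loss.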

Safetensors: model size 3B params, tensor type BF16
