R3po
/

gemma-3-grpo

Text Generation

Model card Files Files and versions

Gemma 3 (1B) — GRPO Reasoning Model

Author: R3po
Institution: EAFIT University
Course: Artificial Intelligence — Workshop #3
Base model: unsloth/gemma-3-1b-it
License: Apache-2.0

This gemma3_text model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month: -; Downloads are not tracked for this model. How to track

Model tree for R3po/gemma-3-grpo

Base model

google/gemma-3-1b-pt

Finetuned

google/gemma-3-1b-it

Finetuned

unsloth/gemma-3-1b-it

Finetuned

(443)

this model