pavan01729
/

Grpo_chem_llama1b

Text Generation

text-generation-inference

Model card Files Files and versions

Uploaded model

Developed by: pavan01729
License: apache-2.0
Finetuned from model : pavan01729/my_Llama-3.2-1B-Instruct

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month: 2

Safetensors

Model size

1B params

Tensor type

F16

·

Model tree for pavan01729/Grpo_chem_llama1b

Base model

pavan01729/my_Llama-3.2-1B-Instruct

Finetuned

(1)

this model