GPT-OSS-20B for AI Programming


This model has been fine-tuned to output step-by-step guides or recipes for building complex AI systems. The training dataset consists of 900 Q&A pairs built from open-source AI companies' cookbooks and recipes on how to build systems with their frameworks, APIs, etc. You can get an idea of the model's output style and specialization by looking at the dataset it was trained on.

This model is a fine-tuned version of openai/gpt-oss-20b. It has been trained using TRL.

Quick start

from transformers import pipeline

question = "How to fine-tune a vision model for instance segmentation of vehicles starting from a pre-trained huggingface baseline?"
generator = pipeline("text-generation", model="paulprt/gpt-oss-20b-ai-programming", device="cuda")
# Chat-style input: the pipeline applies the model's chat template to the message list.
output = generator([{"role": "user", "content": question}], max_new_tokens=4096, return_full_text=False)[0]
print(output["generated_text"])

Training details

Since the dataset's average token length is relatively high (2,131) and the objective is to learn specific knowledge, I used high rank and alpha values. This brings the total number of trainable parameters to 120M.
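A statistic like the average token length above can be measured with a small helper. This is an illustrative sketch: in practice you would pass the model's actual tokenizer (e.g. `AutoTokenizer.from_pretrained("openai/gpt-oss-20b")`); here a whitespace tokenizer stands in so the example is self-contained, and the example texts are made up.

```python
def mean_token_length(examples, tokenize):
    """Average number of tokens per example under the given tokenizer."""
    counts = [len(tokenize(text)) for text in examples]
    return sum(counts) / len(counts)

# Stand-in data and tokenizer; swap in the real dataset and
# tok = AutoTokenizer.from_pretrained(...); tokenize = lambda t: tok(t)["input_ids"]
examples = [
    "How do I fine-tune a vision model?",
    "Use LoRA adapters on the attention and MLP projections.",
]
print(mean_token_length(examples, str.split))  # 8.0
```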

Here is my LoRA config:

from peft import LoraConfig

peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules="all-linear",
    target_parameters=[
        "7.mlp.experts.gate_up_proj",
        "7.mlp.experts.down_proj",
        "15.mlp.experts.gate_up_proj",
        "15.mlp.experts.down_proj",
        "23.mlp.experts.gate_up_proj",
        "23.mlp.experts.down_proj",
    ],
)
trainable params: 120,324,096 || all params: 21,035,081,280 || trainable%: 0.5720
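The trainable-parameter count follows from the LoRA construction: each targeted weight of shape (d_in, d_out) gains two low-rank factors, A (d_in × r) and B (r × d_out), so it contributes r × (d_in + d_out) parameters. A minimal sketch of that arithmetic (the layer shape used below is illustrative, not an actual gpt-oss-20b dimension):

```python
def lora_param_count(shapes, r):
    """Total added LoRA parameters for a list of (d_in, d_out) weight shapes.

    Each targeted weight gets factors A (d_in x r) and B (r x d_out),
    i.e. r * (d_in + d_out) extra parameters.
    """
    return sum(r * (d_in + d_out) for d_in, d_out in shapes)

# Hypothetical square projection at rank 64:
print(lora_param_count([(2880, 2880)], 64))  # 368640
```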

I used a single H200 SXM GPU and trained the model for 2 epochs.
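For reference, a TRL SFT setup along these lines might look like the sketch below. Only `num_train_epochs=2` and the LoRA config are stated above; every other hyperparameter here is an assumption for illustration, and `dataset` stands for the 900 Q&A pairs.

```python
from trl import SFTConfig, SFTTrainer

# Sketch only: hyperparameters marked "assumption" are not the exact values used.
training_args = SFTConfig(
    output_dir="gpt-oss-20b-ai-programming",
    num_train_epochs=2,              # stated above
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=8,   # assumption
    learning_rate=2e-4,              # assumption
    bf16=True,                       # assumption
)
trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    args=training_args,
    train_dataset=dataset,           # the 900 Q&A pairs
    peft_config=peft_config,         # the LoraConfig shown above
)
trainer.train()
```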


Framework versions

  • TRL: 0.27.0
  • Transformers: 4.57.6
  • Pytorch: 2.8.0+cu128
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citations

Cite TRL as:

@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}