GPT-OSS-20B for AI Programming
This model has been fine-tuned to output step-by-step guides or recipes for building complex AI systems. The training dataset consists of 900 Q&A pairs built from open-source AI companies' cookbooks and recipes on how to build systems using their frameworks, APIs, etc. You can get a sense of the model's output style and specialization by browsing the dataset it was trained on.
This model is a fine-tuned version of openai/gpt-oss-20b. It has been trained using TRL.
Quick start
from transformers import pipeline

question = "How to fine-tune a vision model for instance segmentation of vehicles starting from a pre-trained huggingface baseline?"

# Load the fine-tuned model as a text-generation pipeline on GPU
generator = pipeline("text-generation", model="paulprt/gpt-oss-20b-ai-programming", device="cuda")

# Chat-style input; return only the newly generated tokens
output = generator([{"role": "user", "content": question}], max_new_tokens=4096, return_full_text=False)[0]
print(output["generated_text"])
Training details
Because the dataset's average sequence length is relatively high (2,131 tokens) and the objective is to learn specific knowledge, I chose high rank and alpha values. This brings the total number of trainable parameters to 120M.
Here is my LoRA config:
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules="all-linear",
    # Additionally adapt the MoE expert projections in layers 7, 15, and 23
    target_parameters=[
        "7.mlp.experts.gate_up_proj",
        "7.mlp.experts.down_proj",
        "15.mlp.experts.gate_up_proj",
        "15.mlp.experts.down_proj",
        "23.mlp.experts.gate_up_proj",
        "23.mlp.experts.down_proj",
    ],
)
trainable params: 120,324,096 || all params: 21,035,081,280 || trainable%: 0.5720
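As a sanity check, the reported trainable percentage is just the ratio of the two parameter counts printed above:

```python
trainable = 120_324_096
total = 21_035_081_280

# Trainable fraction as a percentage, matching PEFT's printout
pct = trainable / total * 100
print(f"trainable%: {pct:.4f}")  # → trainable%: 0.5720
```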
I used a single H200 SXM GPU and trained the model for 2 epochs.
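The full training script isn't shown here; a minimal TRL SFT setup using the LoRA config above might look like the following configuration sketch. The dataset file name, batch size, and other hyperparameters are illustrative assumptions, not the values actually used; only the base model, the LoRA settings, and the 2 epochs come from this card.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumption: the 900 Q&A pairs live in a local JSONL file (placeholder name)
dataset = load_dataset("json", data_files="ai_cookbook_qa.jsonl", split="train")

# LoRA settings from the config above (expert target_parameters omitted for brevity)
peft_config = LoraConfig(r=64, lora_alpha=128, target_modules="all-linear")

training_args = SFTConfig(
    output_dir="gpt-oss-20b-ai-programming",
    num_train_epochs=2,              # matches the 2 epochs mentioned above
    per_device_train_batch_size=1,   # illustrative; single H200 SXM
    gradient_accumulation_steps=8,   # illustrative
    bf16=True,
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```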
Framework versions
- TRL: 0.27.0
- Transformers: 4.57.6
- PyTorch: 2.8.0+cu128
- Datasets: 4.5.0
- Tokenizers: 0.22.2
Citations
Cite TRL as:
@misc{vonwerra2022trl,
    title = {{TRL: Transformer Reinforcement Learning}},
    author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year = 2020,
    journal = {GitHub repository},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}