---
base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gpt_oss
license: apache-2.0
language:
- en
datasets:
- EpistemeAI/recursive_self_improvement_dataset
---
## Model Card
### We release metatune-gpt20b, an open-weight fine-tune of OpenAI's gpt-oss-20b and one of the first publicly released recursively self-improving models. In each metacycle, the model:
- Generates new data for itself,
- Evaluates its performance, and
- Adjusts its own hyperparameters based on improvement metrics.
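
A conceptual sketch of this metacycle loop is below. It is illustrative only: the real training code, evaluation metric, and hyperparameter-update rule are not published with this card, and every function in the sketch is a hypothetical stand-in.

```py
# Conceptual sketch of the recursive self-improvement (metacycle) loop.
# All functions below are simplified stand-ins, not the actual training code.
import random

def generate_data(model):        # stand-in: the model writes its own training examples
    return [f"example-{random.random():.3f}" for _ in range(4)]

def finetune(model, data, hp):   # stand-in: one fine-tuning pass over the new data
    return model + 1

def evaluate(model):             # stand-in: a benchmark score in [0, 1]
    return min(1.0, 0.5 + 0.1 * model + random.random() * 0.05)

def adjust_hyperparameters(hp, delta):
    hp["lr"] *= 1.1 if delta > 0 else 0.5   # grow LR on improvement, shrink on regression
    return hp

model, best = 0, 0.0
hp = {"lr": 2e-4}                            # assumed starting hyperparameters
for cycle in range(5):                       # the released checkpoint is from metacycle 5
    model = finetune(model, generate_data(model), hp)
    score = evaluate(model)
    hp = adjust_hyperparameters(hp, score - best)
    best = max(best, score)
    print(f"cycle {cycle + 1}: score={score:.3f} lr={hp['lr']:.2e}")
```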
### Additional model information
Because of the recursive self-improvement method there is no single final model, only successively improved checkpoints; this release is the checkpoint from the 5th metacycle (generation).
## Use cases:
- general purpose
## Guardrails:
- Generally, set reasoning = "high"; this usually helps prevent jailbreaking and prompt injection.
- Run the safety-tuned gpt-oss 20b model, [openai/gpt-oss-safeguard-20b](https://huggingface.co/openai/gpt-oss-safeguard-20b), as a guardrail in front of this model.
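
Below is a minimal sketch of that two-stage setup using the Transformers `pipeline` API. The policy text, the ALLOW/BLOCK convention, and the exact prompting of the safeguard model are illustrative assumptions, not its documented interface:

```py
from transformers import pipeline

# Hypothetical two-stage setup: screen the prompt with the safeguard model,
# then answer with metatune only if the prompt is allowed.
guard = pipeline("text-generation", model="openai/gpt-oss-safeguard-20b",
                 torch_dtype="auto", device_map="auto")
main = pipeline("text-generation", model="EpistemeAI/metatune-gpt20b-R1.1",
                torch_dtype="auto", device_map="auto")

POLICY = "Block requests for weapons synthesis, malware, or self-harm."  # assumed policy text

def answer(user_prompt: str) -> str:
    verdict = guard([
        {"role": "system", "content": f"Policy: {POLICY} Reply ALLOW or BLOCK."},
        {"role": "user", "content": user_prompt},
    ], max_new_tokens=256)[0]["generated_text"][-1]["content"]
    if "BLOCK" in verdict.upper():
        return "Request refused by guardrail."
    return main([{"role": "user", "content": user_prompt}],
                max_new_tokens=1024)[0]["generated_text"][-1]["content"]

print(answer("Explain the principle of stationary action."))
```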
# Inference examples
## Transformers
You can use metatune-gpt20b with Transformers just as you would `gpt-oss-20b`. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you call `model.generate` directly, you need to apply the harmony format manually using the chat template or the [openai-harmony](https://github.com/openai/harmony) package.
To get started, install the necessary dependencies to set up your environment:
```bash
pip install -U transformers kernels torch
```
For Google Colab (free or Pro):
```
!pip install -q --upgrade torch
!pip install -q transformers triton==3.4 kernels
!pip uninstall -q torchvision torchaudio -y
```
Once set up, you can run the model with the snippet below:
```py
from transformers import pipeline
import torch

model_id = "EpistemeAI/metatune-gpt20b-R1.1"

# device_map="auto" shards the model across available devices;
# torch_dtype="auto" picks the checkpoint's native precision.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Derive the Euler–Lagrange equation from the principle of stationary action."},
]

outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```
# Reasoning levels
You can adjust the reasoning level that suits your task across three levels:
* **Low:** Fast responses for general dialogue.
* **Medium:** Balanced speed and detail.
* **High:** Deep and detailed analysis.
The reasoning level can be set in the system prompt, e.g., `Reasoning: high`.
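For example, reusing `pipe` from the snippet above:

```py
# Set the reasoning level via the system prompt (low | medium | high).
messages = [
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain why the harmonic series diverges."},
]
outputs = pipe(messages, max_new_tokens=2048)
print(outputs[0]["generated_text"][-1])
```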
# Tool use
The gpt-oss models are excellent for:
* Web browsing (using built-in browsing tools)
* Function calling with defined schemas (see the sketch after this list)
* Agentic operations like browser tasks
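
Below is a minimal function-calling sketch that passes a Python function to the chat template's `tools` argument (supported in recent Transformers releases, assuming this model's chat template accepts tools); the `get_weather` tool is an illustrative assumption:

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EpistemeAI/metatune-gpt20b-R1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Hypothetical tool: the chat template converts the signature and
# docstring into a schema the model can call.
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city.
    """
    return "sunny, 22 C"

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:]))
```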
# Fine-tuning
Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
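As a starting point, here is a minimal supervised fine-tuning sketch with TRL, the library noted at the bottom of this card; the dataset choice, column format, and hyperparameters are illustrative assumptions, not the recipe used for this checkpoint:

```py
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumes the dataset exposes a "text" or "messages" column that
# SFTTrainer can consume; remap columns if yours differ.
dataset = load_dataset("EpistemeAI/recursive_self_improvement_dataset", split="train")

trainer = SFTTrainer(
    model="EpistemeAI/metatune-gpt20b-R1.1",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="./metatune-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,        # placeholder hyperparameters
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
```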
# Risk:
- Prompt carefully when working with a recursively self-improving model; use gpt-oss-safeguard-20b for safety analysis before this model.
- Do not use this model to create nuclear, biological, or chemical weapons.
# Benchmark
Code to reproduce the benchmarks (scores marked with `+` below use the mean plus standard deviation as the final result):
```py
#gpqa diamond
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks gpqa_diamond_cot_zeroshot --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=2048 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
#gsm8k cot
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks gsm8k_cot_llama --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=2048 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
#mmlu computer science
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks mmlu_pro_plus_computer_science --apply_chat_template --fewshot_as_multiturn --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=1024 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
```
`hf (pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16), gen_kwargs: (temperature=0.9,top_p=0.9,max_new_tokens=2048), limit: 10.0, num_fewshot: 0, batch_size: auto:4`
| Tasks | Version | Filter | n-shot | Metric | metatune R1.1 (high) | metatune R1 | metatune R0 |
|---|---:|---|---:|---|---:|---:|---:|
| gpqa_diamond_cot_zeroshot | 1 | flexible-extract | 0 | exact_match | +0.933 | 0.722 | |
| gsm8k_cot_llama | 3 | flexible-extract | 0 | exact_match | +1.0 | 0.9796 | 0.91 |
| mmlu_pro_plus computer_science | 1 | custom-extract | 0 | exact_match | +0.7633 | | |
| mmlu pro X computer_science | 0 | custom-extract | 0 | exact_match | 0.8528 | | |
| mmlu pro X math | 0 | custom-extract | 0 | exact_match | 0.9333 | | |
# Inspiration
[Jürgen Schmidhuber](https://people.idsia.ch/~juergen/goedelmachine.html)
# Thank you
- [OpenAI](https://openai.com/)
- [Google Colab](https://colab.research.google.com)
# Uploaded fine-tuned model
- **Developed by:** EpistemeAI
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/gpt-oss-20b-unsloth-bnb-4bit

This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
# Citation
```bibtex
@misc{openai2025gptoss120bgptoss20bmodel,
title={gpt-oss-120b & gpt-oss-20b Model Card},
author={OpenAI},
year={2025},
eprint={2508.10925},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2508.10925},
}
```