---
base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gpt_oss
license: apache-2.0
language:
- en
datasets:
- EpistemeAI/recursive_self_improvement_dataset
---

## Model Card
### We release the open-weight metatune-gpt20b, a fine-tuned version of OpenAI's gpt-oss-20b model and one of the first publicly released recursive self-improving AI models. The model:
- generates new training data for itself,
- evaluates its own performance, and
- adjusts its own hyperparameters based on improvement metrics.

### Additional model information
Because of the recursive self-improvement method there is no single final model, only successively improved checkpoints; this is the checkpoint from the 5th metacycle (generation).

## Use cases
- General-purpose text generation

## Guardrails
- Set reasoning to "high"; this usually helps prevent jailbreaking and prompt injection.
- Run prompts through the safety model [openai/gpt-oss-safeguard-20b](https://huggingface.co/openai/gpt-oss-safeguard-20b) before passing them to this model.

# Inference examples

## Transformers

You can use `gpt-oss-120b` and `gpt-oss-20b` with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually using the chat template or use our [openai-harmony](https://github.com/openai/harmony) package.

To get started, install the necessary dependencies to set up your environment:

```
pip install -U transformers kernels torch 
```

For Google Colab (free/Pro):
```
!pip install -q --upgrade torch

!pip install -q transformers triton==3.4 kernels

!pip uninstall -q torchvision torchaudio -y
```

Once set up, you can run the model with the snippet below:

```py
from transformers import pipeline
import torch
model_id = "EpistemeAI/metatune-gpt20b-R1.1"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)
messages = [
    {"role": "user", "content": "Derive the Euler–Lagrange equation from the principle of stationary action.""},
]
outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```
# Reasoning levels

You can adjust the reasoning level that suits your task across three levels:

* **Low:** Fast responses for general dialogue.  
* **Medium:** Balanced speed and detail.  
* **High:** Deep and detailed analysis.

The reasoning level can be set in the system prompts, e.g., "Reasoning: high".
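As a minimal sketch (assuming the same pipeline setup as in the inference snippet above), the reasoning level can be requested through a system message:

```python
# Sketch: requesting a reasoning level via the system prompt.
# The message format follows the standard chat schema used earlier
# in this card; "Reasoning: high" follows the gpt-oss convention.
messages = [
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain why the harmonic series diverges."},
]

# With the pipeline from the snippet above, these messages are passed
# the same way (requires the model to be downloaded):
# outputs = pipe(messages, max_new_tokens=3000)

# Sanity-check that every message is well-formed.
assert all({"role", "content"} <= set(m) for m in messages)
```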

# Tool use

The gpt-oss models are excellent for:
* Web browsing (using built-in browsing tools)
* Function calling with defined schemas
* Agentic operations like browser tasks
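For function calling, a tool is described with a JSON schema and passed to the chat template. The sketch below uses a hypothetical `get_weather` tool; the tool name and parameters are illustrative, not part of this model's release:

```python
import json

# Hypothetical tool schema for function calling; the name and
# parameters are illustrative examples, not shipped with the model.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Transformers chat templates generally accept such schemas via the
# `tools` argument of `apply_chat_template`; shown for context only,
# since it requires the tokenizer to be downloaded:
# prompt = tokenizer.apply_chat_template(
#     messages, tools=[get_weather_tool],
#     add_generation_prompt=True, tokenize=False,
# )

# The schema must be valid JSON for the template to render it.
schema_json = json.dumps(get_weather_tool)
```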

# Fine-tuning

Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
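As a sketch of what such a fine-tune might look like, the hyperparameters below are illustrative assumptions, not the settings used to train this model:

```python
# Illustrative LoRA fine-tuning hyperparameters for a gpt-oss-20b
# checkpoint. All values are assumptions for a sketch, not the
# configuration used to produce metatune-gpt20b.
lora_config = {
    "r": 16,                  # LoRA rank
    "lora_alpha": 32,         # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

training_config = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "num_train_epochs": 1,
}
```

These dicts would typically be unpacked into `peft.LoraConfig` and TRL's `SFTConfig` keyword arguments when building an `SFTTrainer`.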


# Risk
- Prompt this recursive self-improvement model with care. Use the gpt-oss safeguard model for safety analysis.
- Do not use this model to create nuclear, biological, or chemical weapons.

# Benchmark

Code to reproduce the benchmarks (values marked with + include the standard error in the final result):
```py
#gpqa diamond
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks gpqa_diamond_cot_zeroshot  --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=2048 --batch_size auto:4 --limit 10  --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
#gsm8k cot
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks gsm8k_cot_llama  --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=2048 --batch_size auto:4 --limit 10  --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
#mmlu computer science
!lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16 --tasks mmlu_pro_plus_computer_science --apply_chat_template --fewshot_as_multiturn  --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=1024 --batch_size auto:4 --limit 10  --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3

```

hf (pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16), gen_kwargs: (temperature=0.9,top_p=0.9,max_new_tokens=2048), limit: 10.0, num_fewshot: 0, batch_size: auto:4
|Tasks                          |Version|Filter          |n-shot|Metric     |metatune R1.1 (high)|metatune R1|metatune R0|
|-------------------------------|------:|----------------|:-----|-----------|:-------------------|:----------|:----------|
|gpqa_diamond_cot_zeroshot      |      1|flexible-extract|     0|exact_match|+0.933              |0.722      |           |
|gsm8k_cot_llama                |      3|flexible-extract|     0|exact_match|+1.0                |0.9796     |0.91       |
|mmlu_pro_plus_computer_science |      1|custom-extract  |     0|exact_match|+0.7633             |           |           |
|mmlu_pro_x_computer_science    |      0|custom-extract  |     0|exact_match|0.8528              |           |           |
|mmlu_pro_x_math                |      0|custom-extract  |     0|exact_match|0.9333              |           |           |

# Inspiration
[Jürgen Schmidhuber](https://people.idsia.ch/~juergen/goedelmachine.html)

# Thank you
- [OpenAI](https://openai.com/)
- [Google Colab](https://colab.research.google.com)


# Uploaded fine-tuned model

- **Developed by:** EpistemeAI
- **License:** apache-2.0
- **Finetuned from model :** unsloth/gpt-oss-20b-unsloth-bnb-4bit

This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# Citation

```bibtex
@misc{openai2025gptoss120bgptoss20bmodel,
      title={gpt-oss-120b & gpt-oss-20b Model Card}, 
      author={OpenAI},
      year={2025},
      eprint={2508.10925},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.10925}, 
}
```