Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
- Do not use this model for creating nuclear, biological, or chemical weapons.
- Do not allow harmful or malicious outputs.
Code to reproduce the benchmarks (using +std for the final result):

```shell
# gpqa diamond
lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.2,parallelize=True,dtype=bfloat16 --tasks gpqa_diamond_cot_zeroshot --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=2048 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3

# gsm8k cot
lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.2,parallelize=True,dtype=bfloat16 --tasks gsm8k_cot_llama --apply_chat_template --fewshot_as_multiturn --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=1024 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3

# mmlu computer science
lm_eval --model hf --model_args pretrained=EpistemeAI/metatune-gpt20b-R1.2,parallelize=True,dtype=bfloat16 --tasks mmlu_pro_plus_computer_science --apply_chat_template --fewshot_as_multiturn --num_fewshot 0 --gen_kwargs temperature=0.9,top_p=0.9,max_new_tokens=1024 --batch_size auto:4 --limit 10 --device cuda:0 --output_path ./eval_harness/gpt-oss-20b3
```
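If "+std" here means reporting the mean plus one standard deviation across repeated evaluation runs (an assumption, since the note does not define it), the aggregation can be sketched as follows; the scores below are made-up placeholders, not real results:

```python
import statistics

# Hypothetical per-run accuracy scores from repeated eval runs (placeholders).
scores = [0.60, 0.70, 0.65]

mean = statistics.mean(scores)    # average over runs
std = statistics.stdev(scores)    # sample standard deviation
final = mean + std                # "+std" upper value

print(f"{mean:.3f} + {std:.3f} -> {final:.3f}")
```

With these placeholder scores the script prints `0.650 + 0.050 -> 0.700`.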
## Benchmark

hf (pretrained=EpistemeAI/metatune-gpt20b-R1.1,parallelize=True,dtype=bfloat16), gen_kwargs: (temperature=0.9,top_p=0.9,max_new_tokens=2048), limit: 10.0, num_fewshot: 0, batch_size: auto:4
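Each `lm_eval` run above writes a JSON results file under `--output_path`. As a minimal sketch of reading scores back out, assuming the harness's usual layout of a `"results"` object keyed by task name (the exact metric keys and file naming vary by task and harness version, and the value below is a placeholder):

```python
import json

# Placeholder mimicking the shape of an lm_eval results file; in practice,
# load the JSON written under ./eval_harness/gpt-oss-20b3 instead.
sample = json.loads("""
{
  "results": {
    "gpqa_diamond_cot_zeroshot": {"exact_match,flexible-extract": 0.5}
  }
}
""")

# Print every metric reported for every task.
for task, metrics in sample["results"].items():
    for metric, value in metrics.items():
        print(f"{task}: {metric} = {value}")
```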