This model was trained on 12,312,444,928 tokens (about 12.3B) from the kjj0/fineweb100B-gpt2 dataset.
$ lm_eval --model hf \
--model_args pretrained=michaelbzhu/test-7.6B-base,trust_remote_code=True \
--tasks mmlu_college_medicine,hellaswag,lambada_openai,arc_easy,winogrande,arc_challenge,openbookqa \
--device cuda:0 \
--batch_size 16
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|----------------|------:|------|-----:|----------|---|------:|---|-----:|
|arc_challenge | 1|none | 0|acc |↑ | 0.2295|± |0.0123|
| | |none | 0|acc_norm |↑ | 0.2628|± |0.0129|
|arc_easy | 1|none | 0|acc |↑ | 0.5358|± |0.0102|
| | |none | 0|acc_norm |↑ | 0.4663|± |0.0102|
|hellaswag | 1|none | 0|acc |↑ | 0.3788|± |0.0048|
| | |none | 0|acc_norm |↑ | 0.4801|± |0.0050|
|lambada_openai | 1|none | 0|acc |↑ | 0.4527|± |0.0069|
| | |none | 0|perplexity|↓ |14.3601|± |0.4468|
|college_medicine| 1|none | 0|acc |↑ | 0.2254|± |0.0319|
|openbookqa | 1|none | 0|acc |↑ | 0.1920|± |0.0176|
| | |none | 0|acc_norm |↑ | 0.3020|± |0.0206|
|winogrande | 1|none | 0|acc |↑ | 0.5107|± |0.0140|
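
To help read the `Value ± Stderr` columns, here is a minimal sketch that turns each accuracy into an approximate 95% confidence interval and checks whether the interval lies above a chance baseline. The numbers are copied from the table above. The chance baselines (0.25 for the multiple-choice tasks, 0.5 for winogrande) and the normal approximation with z = 1.96 are assumptions made for illustration, not part of the harness output. The helper names `ci95` and `above_chance` are hypothetical.

```python
# (accuracy, standard error) copied from the results table above.
results = {
    "arc_challenge": (0.2295, 0.0123),
    "arc_easy": (0.5358, 0.0102),
    "hellaswag": (0.3788, 0.0048),
    "college_medicine": (0.2254, 0.0319),
    "openbookqa": (0.1920, 0.0176),
    "winogrande": (0.5107, 0.0140),
}

# Assumed chance baselines: four answer options for most tasks,
# two options for winogrande. These are illustrative assumptions.
chance = {name: 0.25 for name in results}
chance["winogrande"] = 0.5

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval


def ci95(value, stderr):
    """Return the (lower, upper) bounds of an approximate 95% interval."""
    margin = Z_95 * stderr
    return value - margin, value + margin


def above_chance(value, stderr, baseline):
    """True when the whole 95% interval lies above the baseline."""
    lower, _ = ci95(value, stderr)
    return lower > baseline


for name, (acc, se) in results.items():
    low, high = ci95(acc, se)
    verdict = above_chance(acc, se, chance[name])
    print(f"{name:17s} acc={acc:.4f} 95% CI=[{low:.4f}, {high:.4f}] above chance: {verdict}")
```

Under these assumptions, a task whose interval contains its baseline cannot be distinguished from random guessing on raw `acc` alone. The `acc_norm` rows can be checked the same way by swapping in their values.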