---
library_name: transformers
license: mit
datasets:
- kjj0/fineweb100B-gpt2
---

Trained on 12,312,444,928 tokens from the [kjj0/fineweb100B-gpt2](https://huggingface.co/datasets/kjj0/fineweb100B-gpt2) dataset.

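A minimal inference sketch with 🤗 Transformers. The prompt and generation settings here are illustrative, not from the original training setup; `trust_remote_code=True` mirrors the `model_args` used in the evaluation command below.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "michaelbzhu/test-7.6B-base"

# trust_remote_code=True matches the lm_eval invocation;
# the repository appears to ship custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Illustrative prompt; sampling is left at defaults (greedy decoding).
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```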
```
$ lm_eval --model hf \
    --model_args pretrained=michaelbzhu/test-7.6B-base,trust_remote_code=True \
    --tasks mmlu_college_medicine,hellaswag,lambada_openai,arc_easy,winogrande,arc_challenge,openbookqa \
    --device cuda:0 \
    --batch_size 16

|     Tasks      |Version|Filter|n-shot|  Metric  |   | Value |   |Stderr|
|----------------|------:|------|-----:|----------|---|------:|---|-----:|
|arc_challenge   |      1|none  |     0|acc       |↑  | 0.2295|±  |0.0123|
|                |       |none  |     0|acc_norm  |↑  | 0.2628|±  |0.0129|
|arc_easy        |      1|none  |     0|acc       |↑  | 0.5358|±  |0.0102|
|                |       |none  |     0|acc_norm  |↑  | 0.4663|±  |0.0102|
|hellaswag       |      1|none  |     0|acc       |↑  | 0.3788|±  |0.0048|
|                |       |none  |     0|acc_norm  |↑  | 0.4801|±  |0.0050|
|lambada_openai  |      1|none  |     0|acc       |↑  | 0.4527|±  |0.0069|
|                |       |none  |     0|perplexity|↓  |14.3601|±  |0.4468|
|college_medicine|      1|none  |     0|acc       |↑  | 0.2254|±  |0.0319|
|openbookqa      |      1|none  |     0|acc       |↑  | 0.1920|±  |0.0176|
|                |       |none  |     0|acc_norm  |↑  | 0.3020|±  |0.0206|
|winogrande      |      1|none  |     0|acc       |↑  | 0.5107|±  |0.0140|
```