## Base model evaluation

timestamp: 2025-11-03 09:13:28

- Model: base_model (step 21400)
- CORE metric: 0.2137
- hellaswag_zeroshot: 0.2687
- jeopardy: 0.1214
- bigbench_qa_wikidata: 0.5278
- arc_easy: 0.5314
- arc_challenge: 0.1251
- copa: 0.3600
- commonsense_qa: 0.1145
- piqa: 0.3917
- openbook_qa: 0.1360
- lambada_openai: 0.3549
- hellaswag: 0.2634
- winograd: 0.2601
- winogrande: 0.1018
- bigbench_dyck_languages: 0.1080
- agi_eval_lsat_ar: 0.1359
- bigbench_cs_algorithms: 0.3720
- bigbench_operators: 0.1429
- bigbench_repeat_copy_logic: 0.0000
- squad: 0.2528
- coqa: 0.1932
- boolq: -0.2369
- bigbench_language_identification: 0.1762
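The reported CORE metric is consistent with a simple unweighted mean of the 22 per-task scores above (assuming that is how CORE aggregates; the scores appear to be centered, i.e. 0 is chance level, which is why boolq can be negative). A minimal sketch reproducing the aggregate:

```python
# Per-task scores copied from the report above. Values appear to be
# centered accuracies (0 = chance level), so negatives are possible.
scores = {
    "hellaswag_zeroshot": 0.2687,
    "jeopardy": 0.1214,
    "bigbench_qa_wikidata": 0.5278,
    "arc_easy": 0.5314,
    "arc_challenge": 0.1251,
    "copa": 0.3600,
    "commonsense_qa": 0.1145,
    "piqa": 0.3917,
    "openbook_qa": 0.1360,
    "lambada_openai": 0.3549,
    "hellaswag": 0.2634,
    "winograd": 0.2601,
    "winogrande": 0.1018,
    "bigbench_dyck_languages": 0.1080,
    "agi_eval_lsat_ar": 0.1359,
    "bigbench_cs_algorithms": 0.3720,
    "bigbench_operators": 0.1429,
    "bigbench_repeat_copy_logic": 0.0000,
    "squad": 0.2528,
    "coqa": 0.1932,
    "boolq": -0.2369,
    "bigbench_language_identification": 0.1762,
}

# Unweighted mean over all tasks (assumed aggregation rule).
core = sum(scores.values()) / len(scores)
print(f"CORE = {core:.4f}")  # → CORE = 0.2137
```

Running this recovers 0.2137, matching the reported CORE value, which supports the unweighted-mean reading.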