Base model evaluation

timestamp: 2025-11-03 09:13:28

Model: base_model (step 21400)
CORE metric: 0.2137
hellaswag_zeroshot: 0.2687
jeopardy: 0.1214
bigbench_qa_wikidata: 0.5278
arc_easy: 0.5314
arc_challenge: 0.1251
copa: 0.3600
commonsense_qa: 0.1145
piqa: 0.3917
openbook_qa: 0.1360
lambada_openai: 0.3549
hellaswag: 0.2634
winograd: 0.2601
winogrande: 0.1018
bigbench_dyck_languages: 0.1080
agi_eval_lsat_ar: 0.1359
bigbench_cs_algorithms: 0.3720
bigbench_operators: 0.1429
bigbench_repeat_copy_logic: 0.0000
squad: 0.2528
coqa: 0.1932
boolq: -0.2369
bigbench_language_identification: 0.1762
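
The CORE metric reported above appears to be the unweighted mean of the 22 per-task scores. The following is a minimal Python sketch that checks this arithmetic using the values copied from the list. It assumes the per-task values are centered scores (rescaled so that 0 corresponds to a random-guess baseline), which would explain the negative boolq entry. That interpretation is an assumption and is not stated in the report itself. The variable names are illustrative only.

```python
# Per-task scores copied verbatim from the report above.
scores = {
    "hellaswag_zeroshot": 0.2687,
    "jeopardy": 0.1214,
    "bigbench_qa_wikidata": 0.5278,
    "arc_easy": 0.5314,
    "arc_challenge": 0.1251,
    "copa": 0.3600,
    "commonsense_qa": 0.1145,
    "piqa": 0.3917,
    "openbook_qa": 0.1360,
    "lambada_openai": 0.3549,
    "hellaswag": 0.2634,
    "winograd": 0.2601,
    "winogrande": 0.1018,
    "bigbench_dyck_languages": 0.1080,
    "agi_eval_lsat_ar": 0.1359,
    "bigbench_cs_algorithms": 0.3720,
    "bigbench_operators": 0.1429,
    "bigbench_repeat_copy_logic": 0.0000,
    "squad": 0.2528,
    "coqa": 0.1932,
    "boolq": -0.2369,
    "bigbench_language_identification": 0.1762,
}

# Unweighted mean over all listed tasks.
core = sum(scores.values()) / len(scores)

# Tasks scoring below zero (below the baseline, under the centering assumption).
below_baseline = [task for task, score in scores.items() if score < 0]

print(f"{len(scores)} tasks, mean = {core:.4f}")
print("below baseline:", below_baseline)
```

The computed mean rounds to 0.2137, matching the reported CORE metric, and boolq is the only task with a negative score.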