burtenshaw (HF Staff) committed
Commit 1f6d2f6 · verified · 1 parent: 016dba0

add evaluation results as metadata to model card

Files changed (1): README.md (+94, −1)

README.md CHANGED
@@ -1,5 +1,98 @@
 ---
 license: mit
+datasets:
+- karpathy/fineweb-edu-100b-shuffle
+model-index:
+- name: chat-d10
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+    metrics:
+    - type: acc_norm
+      value: 49.91
+      name: normalized accuracy
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Easy
+      split: test
+    metrics:
+    - type: acc_norm
+      value: 67.97
+      name: normalized accuracy
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+    metrics:
+    - type: acc
+      value: 40.49
+      name: accuracy
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+    metrics:
+    - type: acc
+      value: 12.74
+      name: accuracy
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HumanEval
+      type: openai_humaneval
+      split: test
+    metrics:
+    - type: pass@1
+      value: 12.8
+      name: pass@1
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ChatCORE
+      type: chatcore
+      split: test
+    metrics:
+    - type: score
+      value: 27.34
+      name: ChatCORE metric
+    source:
+      url: https://github.com/karpathy/nanochat/discussions/8
+      name: nanochat
 ---
 
 The nanochat-d32 model described in detail [here](https://github.com/karpathy/nanochat/discussions/8).
@@ -9,4 +102,4 @@ I'm sorry this is a janky upload but you have to place these files correctly on
 - the token_bytes.pt, tokenizer.pkl have to go into ~/.cache/nanochat/tokenizer directory
 - the meta_000650.json, model_000650.pt have to go into ~/.cache/nanochat/chatsft_checkpoints/d32/
 
-I'll figure out how to make this less janky in the future, and to make nanochat play nicer with huggingface infra.
+I'll figure out how to make this less janky in the future, and to make nanochat play nicer with huggingface infra.
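The manual file placement described in the README can be scripted. A minimal stdlib-only sketch, assuming the four files were already downloaded into one local directory (e.g. by cloning the model repo); the helper names here are hypothetical, not part of nanochat:

```python
# Sketch: copy the downloaded nanochat-d32 files into the directories the
# README says nanochat reads from (~/.cache/nanochat/...). Assumes the four
# files were already fetched into src_dir, e.g. by cloning the model repo.
import shutil
from pathlib import Path


def nanochat_destinations(cache_root: Path) -> dict[str, Path]:
    """Map each file name to the directory nanochat expects it in."""
    tokenizer_dir = cache_root / "tokenizer"
    checkpoint_dir = cache_root / "chatsft_checkpoints" / "d32"
    return {
        "token_bytes.pt": tokenizer_dir,
        "tokenizer.pkl": tokenizer_dir,
        "meta_000650.json": checkpoint_dir,
        "model_000650.pt": checkpoint_dir,
    }


def place_files(src_dir: Path, cache_root: Path = Path.home() / ".cache" / "nanochat") -> None:
    """Copy each file from src_dir into its expected cache directory."""
    for name, dest_dir in nanochat_destinations(cache_root).items():
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_dir / name, dest_dir / name)
```

Hypothetical usage: after cloning the model repo into `nanochat-d32/`, run `place_files(Path("nanochat-d32"))`.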