Model Specs
The model configuration (derived from meta_002849.json) is as follows:
| Parameter | Value |
|---|---|
| Layers | 18 |
| Embedding Dim | 1152 |
| Heads (Q/KV) | 9 / 9 (GQA with equal Q and KV head counts, equivalent to standard multi-head attention) |
| Vocab Size | 65,536 |
| Max Seq Len | 2048 |
| Window Pattern | SSSL |

| Metric | BASE | SFT | RL |
|---|---|---|---|
| CORE | 0.2500 | - | - |
| ARC-Challenge | - | 0.3942 | - |
| ARC-Easy | - | 0.4722 | - |
| GSM8K | - | 0.0387 | - |
| HumanEval | - | 0.0915 | - |
| MMLU | - | 0.3418 | - |
| ChatCORE | - | 0.2882 | - |
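The configuration table above implies a few derived quantities that are easy to check by hand. The sketch below is illustrative only: the `ModelSpec` class and its field names are hypothetical and are not taken from the repository. It also assumes that the window pattern `SSSL` is simply repeated across the 18 layers; the source does not say how the repository expands the pattern or whether any layer is treated specially.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Values copied from the configuration table; field names are hypothetical."""

    n_layer: int = 18
    n_embd: int = 1152
    n_head: int = 9
    n_kv_head: int = 9
    vocab_size: int = 65_536
    max_seq_len: int = 2048
    window_pattern: str = "SSSL"

    @property
    def head_dim(self) -> int:
        # The embedding dimension is split evenly across the query heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    @property
    def queries_per_kv_head(self) -> int:
        # 1 means every query head has its own key/value head.
        if self.n_head % self.n_kv_head != 0:
            raise ValueError("n_head must be divisible by n_kv_head")
        return self.n_head // self.n_kv_head

    def layer_windows(self) -> str:
        # Assumption: the pattern repeats cyclically, one character per layer.
        pattern = self.window_pattern
        return "".join(pattern[i % len(pattern)] for i in range(self.n_layer))


spec = ModelSpec()
print(spec.head_dim)             # 128
print(spec.queries_per_kv_head)  # 1
print(spec.layer_windows())      # SSSLSSSLSSSLSSSLSS
```

With 9 query heads and 9 key/value heads, each group contains a single query head, so the grouped-query layout here reduces to ordinary multi-head attention with a head dimension of 128.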
Quick Start
This model uses a custom architecture and cannot be loaded directly with the standard `transformers` library. Use the source code from the official GitHub repository instead:

    git clone https://github.com/DestineG/nanochat.git
    cd nanochat
    git checkout v1.0
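Before loading a checkpoint with the repository code, it can help to confirm that the metadata file matches the configuration listed in this card. The following is a minimal sketch using only the Python standard library. The helper names `read_meta` and `find_config` are hypothetical, and so is the `model_config` key: the layout of `meta_002849.json` is not documented here, so the sketch falls back to the top-level object when that key is absent.

```python
import json
from pathlib import Path


def read_meta(path):
    """Load a checkpoint metadata JSON file and return it as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        meta = json.load(f)
    if not isinstance(meta, dict):
        raise ValueError("expected a JSON object at the top level")
    return meta


def find_config(meta, key="model_config"):
    """Return the nested config if present, else the metadata itself.

    The key name is an assumption, not a documented field.
    """
    nested = meta.get(key)
    return nested if isinstance(nested, dict) else meta


if __name__ == "__main__":
    # Adjust the path to wherever the checkpoint metadata was downloaded.
    meta_path = Path("meta_002849.json")
    if meta_path.exists():
        for name, value in sorted(find_config(read_meta(meta_path)).items()):
            print(f"{name}: {value}")
    else:
        print(f"{meta_path} not found")
```

Printing the fields side by side with the table in this card is a quick way to catch a mismatched checkpoint before running inference.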