# Max Babbelaar — Chat Model
Instruction-tuned chat model for Max Babbelaar, a bilingual (Dutch + English) character modelled on a 19th-century Dutch gentleman. Fine-tuned from fdeantoni/max-babbelaar-base on curated persona, historical Q&A, and Delpher newspaper search-call SFT examples. Training corpus: fdeantoni/max-babbelaar-corpus.
This repo holds SFT checkpoints for multiple model depths. Each tag (d18, d24, …) lives under `chatsft_checkpoints/<tag>/` and shares a single tokenizer.

Latest upload: **d18** at step 1750 (val_bpb 0.2652).
| Depth | Step | Layers | d_model | Heads (Q/KV) | Vocab | Context |
|---|---|---|---|---|---|---|
| d18 | 1750 | 18 | 1152 | 9/9 | 32768 | 2048 |
Architecture: GPT with RoPE, QK-norm, GQA, relu² MLP, sliding-window pattern SSSL,
value embeddings (ResFormer-style), smear gate, and backout residual. Trained with the
nanochat fork.
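As a rough sanity check, the d18 geometry in the table works out as below. The variable names are illustrative, not nanochat's actual `GPTConfig` fields, and the expansion of the SSSL pattern (S = sliding-window layer, L = full-attention layer, tiled across the depth) is our assumed reading:

```python
# Illustrative geometry check for the d18 checkpoint; names are ours,
# not nanochat's GPTConfig fields.
n_layers = 18
d_model = 1152
n_q_heads = 9
n_kv_heads = 9  # Q/KV = 9/9: GQA is configured but degenerates to full MHA here
head_dim = d_model // n_q_heads

# Assumed reading of "sliding-window pattern SSSL": three sliding-window
# layers followed by one full-attention layer, repeated across all layers.
pattern = ["S", "S", "S", "L"]
layer_kinds = [pattern[i % len(pattern)] for i in range(n_layers)]

print(head_dim)               # 128
print(layer_kinds.count("L")) # 4 full-attention layers
```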
## Repo layout
```
chatsft_checkpoints/
  <tag>/
    model_<step>.pt   # SFT weights (torch state dict, bf16)
    meta_<step>.json  # GPTConfig + training metadata, including val_bpb
tokenizer/
  tokenizer.pkl       # tiktoken BPE encoding (vocab 32768, rustbpe-trained)
  token_bytes.pt      # per-token byte tensors
```
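The per-checkpoint metadata is plain JSON, so it can be inspected without loading the weights. A minimal sketch, using a mocked-up file (the real `meta_<step>.json` also embeds the full GPTConfig, and the step's zero-padding in the filename may differ):

```python
import json
import pathlib

# Mock a minimal meta file; the real one carries the full GPTConfig as well.
meta_path = pathlib.Path("meta_1750.json")
meta_path.write_text(json.dumps({"step": 1750, "val_bpb": 0.2652}))

meta = json.loads(meta_path.read_text())
print(f"step {meta['step']}: val_bpb {meta['val_bpb']}")

meta_path.unlink()  # clean up the mock
```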
## Download and run
```python
import os

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="fdeantoni/max-babbelaar-chat",
    repo_type="model",
    allow_patterns=["chatsft_checkpoints/d18/**", "tokenizer/**"],
    local_dir=os.path.expanduser("~/.cache/nanochat"),
    local_dir_use_symlinks=False,
)
```
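The `allow_patterns` globs are matched against repo-relative paths with `fnmatch`-style rules, so the filter can be sanity-checked locally. The file names below are illustrative, not a listing of the actual repo contents:

```python
from fnmatch import fnmatch

# Illustrative repo-relative paths; real checkpoint filenames may differ.
patterns = ["chatsft_checkpoints/d18/**", "tokenizer/**"]
files = [
    "chatsft_checkpoints/d18/model_1750.pt",
    "chatsft_checkpoints/d18/meta_1750.json",
    "chatsft_checkpoints/d24/model_1750.pt",  # other depths are filtered out
    "tokenizer/tokenizer.pkl",
    "tokenizer/token_bytes.pt",
]
kept = [f for f in files if any(fnmatch(f, p) for p in patterns)]
print(kept)  # only the d18 checkpoint files and the tokenizer survive
```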
Then start the chat server or CLI:

```bash
export NANOCHAT_BASE_DIR=~/.cache/nanochat
cd nanochat

# Web interface
uv run python -m scripts.chat_web -g d18

# Command-line chat
uv run python -m scripts.chat_cli
```
## Chat format
Conversations use nanochat's special tokens. Max always responds in the language of the question (Dutch or English), anchored to 1 January 1880.
```
<|bos|>
<|user_start|>Wie was Napoleon Bonaparte?<|user_end|>
<|assistant_start|>Napoleon Bonaparte was...<|assistant_end|>
<|user_start|>Next question<|user_end|>
<|assistant_start|>
```
The model supports a search tool for Delpher newspaper lookups, invoked via `<|python_start|>search(...)<|python_end|>` inside the assistant turn.
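As a sketch, a prompt in this format can be assembled by hand. The `render` helper below is our own illustration, not a nanochat API; it only concatenates the special tokens shown above and leaves the final assistant turn open for the model to complete:

```python
# Illustrative prompt builder for the chat format above (not a nanochat API).
def render(messages):
    """Render (role, text) pairs into the nanochat chat format."""
    parts = ["<|bos|>"]
    for role, text in messages:
        parts.append(f"<|{role}_start|>{text}<|{role}_end|>")
    parts.append("<|assistant_start|>")  # left open for the model's reply
    return "".join(parts)

prompt = render([("user", "Wie was Napoleon Bonaparte?")])
print(prompt)
# <|bos|><|user_start|>Wie was Napoleon Bonaparte?<|user_end|><|assistant_start|>
```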
## Tokenizer
Custom GPT-4-style BPE tokenizer with vocab size 32768, trained on the Babbelaar corpus.
Special tokens: `<|bos|>`, `<|user_start|>`, `<|user_end|>`, `<|assistant_start|>`, `<|assistant_end|>`, `<|python_start|>`, `<|python_end|>`, `<|output_start|>`, `<|output_end|>`.
Stored as a tiktoken pickle at `tokenizer/tokenizer.pkl`. Load it within the nanochat project with:
```python
from nanochat.tokenizer import get_tokenizer  # reads NANOCHAT_BASE_DIR/tokenizer/tokenizer.pkl

tokenizer = get_tokenizer()
```
## Model tree

Base model: fdeantoni/max-babbelaar-base