Max Babbelaar — Chat Model

Instruction-tuned chat model for Max Babbelaar, a bilingual (Dutch + English) character modelled on a 19th-century Dutch gentleman. Fine-tuned from fdeantoni/max-babbelaar-base on curated persona, historical Q&A, and Delpher newspaper search-call SFT examples. Training corpus: fdeantoni/max-babbelaar-corpus.

This repo holds SFT checkpoints for multiple model depths. Each tag (d18, d24, …) lives under chatsft_checkpoints/<tag>/ and shares a single tokenizer.

Latest upload: d18 at step 1750 (val_bpb 0.2652)

| Depth | Step | Layers | d_model | Heads (Q/KV) | Vocab | Context |
|-------|------|--------|---------|--------------|-------|---------|
| d18   | 1750 | 18     | 1152    | 9/9          | 32768 | 2048    |

Architecture: GPT with RoPE, QK-norm, GQA, relu² MLP, sliding-window pattern SSSL, value embeddings (ResFormer-style), smear gate, and backout residual. Trained with the nanochat fork.
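The table and architecture notes above pin down the d18 hyperparameters. As a sketch only, the configuration can be written as a small dataclass; the field names mirror common nanochat/GPT conventions and are an assumption — the authoritative values live in each checkpoint's meta_<step>.json:

```python
from dataclasses import dataclass

# Sketch of the d18 configuration using the numbers from the table above.
# Field names are assumed, not taken from nanochat's actual GPTConfig.
@dataclass
class GPTConfigSketch:
    n_layer: int = 18
    n_embd: int = 1152
    n_head: int = 9        # query heads
    n_kv_head: int = 9     # KV heads (9/9 means GQA degenerates to full MHA here)
    vocab_size: int = 32768
    sequence_len: int = 2048

cfg = GPTConfigSketch()
```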

Repo layout

chatsft_checkpoints/
  <tag>/
    model_<step>.pt     — SFT weights (torch state dict, bf16)
    meta_<step>.json    — GPTConfig + training metadata including val_bpb
tokenizer/
  tokenizer.pkl         — tiktoken BPE encoding (vocab 32768, rustbpe-trained)
  token_bytes.pt        — per-token byte tensors
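Each meta_<step>.json pairs the model config with training metadata such as val_bpb. A minimal sketch of reading it — the inline JSON below stands in for the real file, and its exact field names are an assumption:

```python
import json

# Stand-in for chatsft_checkpoints/d18/meta_1750.json; values are from the
# model card above, field names are assumed.
meta_text = '{"step": 1750, "val_bpb": 0.2652, "model_config": {"n_layer": 18, "n_embd": 1152}}'
meta = json.loads(meta_text)
print(meta["val_bpb"])
```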

Download and run

import os

from huggingface_hub import snapshot_download

# Fetch only the d18 checkpoint plus the shared tokenizer.
snapshot_download(
    repo_id="fdeantoni/max-babbelaar-chat",
    repo_type="model",
    allow_patterns=["chatsft_checkpoints/d18/**", "tokenizer/**"],
    local_dir=os.path.expanduser("~/.cache/nanochat"),
    local_dir_use_symlinks=False,  # copy real files; deprecated and ignored in newer huggingface_hub
)

Then start the chat server or CLI:

export NANOCHAT_BASE_DIR=~/.cache/nanochat
cd nanochat

# Web interface
uv run python -m scripts.chat_web -g d18

# Command-line chat
uv run python -m scripts.chat_cli

Chat format

Conversations use nanochat's special tokens. Max always responds in the language of the question (Dutch or English), anchored to 1 January 1880.

<|bos|>
<|user_start|>Wie was Napoleon Bonaparte?<|user_end|>
<|assistant_start|>Napoleon Bonaparte was...<|assistant_end|>
<|user_start|>Next question<|user_end|>
<|assistant_start|>
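The turn structure above can be assembled programmatically. A minimal sketch, where render_chat is a hypothetical helper (not part of nanochat) that leaves the final assistant turn open for the model to complete:

```python
def render_chat(messages):
    """Render {role, content} dicts into the nanochat chat format shown above."""
    parts = ["<|bos|>"]
    for m in messages:
        parts.append(f"<|{m['role']}_start|>{m['content']}<|{m['role']}_end|>")
    # Open the assistant turn without closing it, so generation continues from here.
    parts.append("<|assistant_start|>")
    return "".join(parts)

prompt = render_chat([{"role": "user", "content": "Wie was Napoleon Bonaparte?"}])
```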

The model supports a search tool for Delpher newspaper lookups, invoked via <|python_start|>search(...)<|python_end|> inside the assistant turn.
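A runtime consuming these tool calls has to pull the search(...) expression out of the python-token span. A hedged sketch of that extraction — the regex approach and the example query are ours, not necessarily how nanochat implements it:

```python
import re

def extract_search_call(assistant_text):
    """Return the search(...) call between the python tokens, or None."""
    m = re.search(r"<\|python_start\|>(search\(.*?\))<\|python_end\|>",
                  assistant_text, re.S)
    return m.group(1) if m else None

turn = '<|python_start|>search("Atjeh-oorlog")<|python_end|>'
call = extract_search_call(turn)
```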

Tokenizer

Custom GPT-4-style BPE tokenizer with vocab size 32768, trained on the Babbelaar corpus. Special tokens: <|bos|> <|user_start|> <|user_end|> <|assistant_start|> <|assistant_end|> <|python_start|> <|python_end|> <|output_start|> <|output_end|>.

Stored as a tiktoken pickle at tokenizer/tokenizer.pkl. Load within the nanochat project with:

from nanochat.tokenizer import get_tokenizer  # reads NANOCHAT_BASE_DIR/tokenizer/tokenizer.pkl
tokenizer = get_tokenizer()