"""Tokenizer training and validation helpers."""