Fix tokenizer: trim to 2000 vocab to match trained model db0e784 verified LisaMegaWatts commited on 5 days ago