Based on the phi-3 tokenizer, expanded by 17291 tokens.
Following the Optimal Vocabulary Size Predictor, I recommend using this tokenizer with a 3-4B model such as phi-3-mini.
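
The expansion count can be checked by diffing the two vocabularies. The sketch below is a minimal, unofficial helper: `added_tokens` and `load_vocab` are hypothetical names introduced here, and the base repository id mentioned in the usage note (`microsoft/Phi-3-mini-4k-instruct`) is an assumption about which phi-3 tokenizer this one was built from. The toy vocabularies only stand in for the real ones so the helper runs offline.

```python
from typing import Dict, List


def added_tokens(base_vocab: Dict[str, int], expanded_vocab: Dict[str, int]) -> List[str]:
    """Return tokens present in expanded_vocab but absent from base_vocab, ordered by id."""
    new = [tok for tok in expanded_vocab if tok not in base_vocab]
    return sorted(new, key=expanded_vocab.__getitem__)


def load_vocab(repo_id: str) -> Dict[str, int]:
    """Download a tokenizer from the Hugging Face Hub and return its token-to-id map."""
    # Imported lazily so the rest of the sketch works without transformers or network access.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(repo_id).get_vocab()


# Toy vocabularies (illustrative only, not the real token lists).
toy_base = {"<s>": 0, "a": 1, "b": 2}
toy_expanded = {"<s>": 0, "a": 1, "b": 2, "new_1": 3, "new_2": 4}
toy_new = added_tokens(toy_base, toy_expanded)
print(len(toy_new), toy_new)
```

With network access, `len(added_tokens(load_vocab("microsoft/Phi-3-mini-4k-instruct"), load_vocab("devngho/phi3-jamo-ko-tokenizer")))` should report the number of added tokens, provided the assumed base repository is the right one.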

    # Load the tokenizer directly (this repository holds a tokenizer, not model weights)
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("devngho/phi3-jamo-ko-tokenizer")