How to use atharvanighot/hindi-tokenizer with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("atharvanighot/hindi-tokenizer", dtype="auto")
A Llama 2 tokenizer extended with 25,104 additional Hindi tokens, which makes it much better suited to Hindi data.
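Since this repository is described as a tokenizer, loading it with `AutoTokenizer` may be the more natural route. The following is a minimal sketch, assuming the repository ships tokenizer files that `AutoTokenizer.from_pretrained` can read. The helper names `tokens_per_word` and `load_tokenizer` and the sample sentence are illustrative and not part of the repository.

```python
def tokens_per_word(tokenizer, text):
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return len(tokenizer.tokenize(text)) / len(words)


def load_tokenizer(repo_id):
    # Imported lazily so tokens_per_word can be used without transformers installed.
    from transformers import AutoTokenizer

    # Assumption: the repo contains tokenizer files AutoTokenizer understands.
    return AutoTokenizer.from_pretrained(repo_id)


if __name__ == "__main__":
    sample = "भारत एक विशाल देश है"  # illustrative Hindi sentence
    hindi_tok = load_tokenizer("atharvanighot/hindi-tokenizer")
    print(hindi_tok.tokenize(sample))
    print(f"tokens per word: {tokens_per_word(hindi_tok, sample):.2f}")
```

A lower tokens-per-word figure on Hindi text, compared with the base Llama 2 tokenizer, would be one rough way to check the effect of the added vocabulary.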