This is a ByteLevelBPETokenizer trained on the JeanKaddour/minipile dataset with the aim of creating a compact English-only tokenizer.
Load it with AutoTokenizer:
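from transformers import AutoTokenizer

# Download the tokenizer files from the Hub and build the tokenizer
tk = AutoTokenizer.from_pretrained('BEE-spoke-data/MiniTokenizer-20480')
print(tk)  # show the tokenizer class and its settings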
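
Once loaded, the tokenizer can be checked with a quick encode/decode round trip. The sketch below is illustrative only: `load_tokenizer` and `roundtrip` are hypothetical helper names, not part of this repository, and the code assumes the standard `encode` and `decode` methods that Transformers tokenizers provide. No token ids are shown because they depend on the trained vocabulary.

```python
def load_tokenizer(repo_id="BEE-spoke-data/MiniTokenizer-20480"):
    """Hypothetical helper: fetch the tokenizer from the Hub."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(repo_id)


def roundtrip(tokenizer, text):
    """Hypothetical helper: encode text to ids, then decode the ids back."""
    # add_special_tokens=False keeps only the tokens produced from the text.
    ids = tokenizer.encode(text, add_special_tokens=False)
    decoded = tokenizer.decode(ids)
    return ids, decoded


# Example usage (requires network access to the Hugging Face Hub):
# tk = load_tokenizer()
# ids, decoded = roundtrip(tk, "A compact English-only tokenizer.")
# print(len(ids), decoded)
```

Comparing `len(ids)` across a few sample sentences gives a rough sense of how compactly the tokenizer encodes English text.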