---
library_name: transformers
license: mit
datasets:
- JeanKaddour/minipile
language:
- en
---

# BEE-spoke-data/MiniTokenizer-20480

This is a `ByteLevelBPETokenizer` trained on the `JeanKaddour/minipile` dataset, with the aim of creating a compact, English-only tokenizer.

## Usage

Load it with `AutoTokenizer`:

```py
from transformers import AutoTokenizer

# Fetches the tokenizer files from the Hugging Face Hub on first use
tk = AutoTokenizer.from_pretrained('BEE-spoke-data/MiniTokenizer-20480')
tk  # in a notebook or REPL, this displays the tokenizer's configuration
```
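
Once the tokenizer is loaded, a quick way to see how it behaves is to encode a short string and look at the resulting ids and tokens. The sketch below is illustrative only: `inspect_tokenization` and `demo` are hypothetical helper names, not part of this repository, and the sample text is arbitrary. It uses only standard `transformers` tokenizer methods (`encode`, `convert_ids_to_tokens`, `decode`).

```python
def inspect_tokenization(tokenizer, text):
    """Encode `text` and return (token ids, token strings, decoded text).

    Hypothetical helper: works with any object exposing the usual
    transformers tokenizer methods used below.
    """
    # Skip special tokens so only the text itself is tokenized
    ids = tokenizer.encode(text, add_special_tokens=False)
    tokens = tokenizer.convert_ids_to_tokens(ids)
    decoded = tokenizer.decode(ids)
    return ids, tokens, decoded


def demo(repo_id="BEE-spoke-data/MiniTokenizer-20480", text="Hello, world!"):
    """Load the tokenizer from the Hub and print a small report."""
    # Imported here so the helper above stays usable without transformers
    from transformers import AutoTokenizer

    tk = AutoTokenizer.from_pretrained(repo_id)
    ids, tokens, decoded = inspect_tokenization(tk, text)
    print(f"vocabulary size (incl. added tokens): {len(tk)}")
    print(f"ids:     {ids}")
    print(f"tokens:  {tokens}")
    print(f"decoded: {decoded!r}")
```

Calling `demo()` requires network access (or a local cache of the tokenizer files). Comparing `decoded` with the input is a simple way to check whether the text survives a round trip unchanged.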