v37/latin-tokenizer-32k
Tags: Latin, sentencepiece, tokenizer, bpe, latin, nlp, classical-languages
License: apache-2.0
Branch: main (5.21 MB total)
1 contributor. History: 8 commits
Latest commit: e45ad24 (verified) by DanieleSalatti, 4 days ago: Update tokenizer_config for fast tokenizer (keep auto_map for backwards compat)
| File | Size | Notes | Last commit | Updated |
|---|---|---|---|---|
| .gitattributes | 1.52 kB | Safe | initial commit | 4 days ago |
| README.md | 3.9 kB | - | Update README with AutoTokenizer usage | 4 days ago |
| latin_bpe_32000.model | 565 kB | Safe, xet | Upload folder using huggingface_hub | 4 days ago |
| latin_bpe_32000.vocab | 520 kB | Safe | Upload folder using huggingface_hub | 4 days ago |
| latin_tokenizer.py | 2.32 kB | Safe | Add custom LatinTokenizer for AutoTokenizer support | 4 days ago |
| tokenizer.json | 4.12 MB | - | Add tokenizer.json for AutoTokenizer support and download tracking | 4 days ago |
| tokenizer_config.json | 314 Bytes | - | Update tokenizer_config for fast tokenizer (keep auto_map for backwards compat) | 4 days ago |
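The commit messages above mention AutoTokenizer support and a raw SentencePiece model file, so the following is a minimal sketch of two ways these files might be loaded. It is not taken from the repository's README. The repo ID `v37/latin-tokenizer-32k` is inferred from the page header, the helper names are hypothetical, and whether `trust_remote_code=True` is needed (because of `latin_tokenizer.py` and the `auto_map` entry) is an assumption left as a parameter.

```python
# Sketch only: repo ID inferred from the page header, helper names are hypothetical.
REPO_ID = "v37/latin-tokenizer-32k"
SPM_MODEL_FILE = "latin_bpe_32000.model"  # file name as shown in the listing


def load_fast_tokenizer(trust_remote_code: bool = False):
    """Load the tokenizer through transformers (reads tokenizer.json / tokenizer_config.json).

    Assumption: the fast tokenizer loads without remote code; pass
    trust_remote_code=True if the custom class in latin_tokenizer.py is required.
    """
    from transformers import AutoTokenizer  # imported lazily; needs `pip install transformers`

    return AutoTokenizer.from_pretrained(REPO_ID, trust_remote_code=trust_remote_code)


def load_sentencepiece_model():
    """Download the raw SentencePiece model file and load it directly."""
    import sentencepiece as spm  # needs `pip install sentencepiece`
    from huggingface_hub import hf_hub_download

    model_path = hf_hub_download(repo_id=REPO_ID, filename=SPM_MODEL_FILE)
    return spm.SentencePieceProcessor(model_file=model_path)


def demo():
    """Tokenize a short Latin sentence both ways (requires network access)."""
    text = "Gallia est omnis divisa in partes tres"
    print(load_fast_tokenizer().tokenize(text))
    print(load_sentencepiece_model().encode(text, out_type=str))
```

Calling `demo()` downloads files from the Hub, so it needs network access and the three packages installed. The two paths should produce comparable pieces if `tokenizer.json` was converted from the same SentencePiece model, but that is not confirmed by the listing.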