v37/latin-tokenizer-32k
Tags: Latin, sentencepiece, tokenizer, bpe, latin, nlp, classical-languages
License: apache-2.0
main / latin-tokenizer-32k (5.21 MB)
1 contributor, History: 8 commits
Latest commit: DanieleSalatti, "Update tokenizer_config for fast tokenizer (keep auto_map for backwards compat)", e45ad24 (verified), about 2 months ago
Files:
.gitattributes, 1.52 kB, Safe: initial commit (about 2 months ago)
README.md, 3.9 kB: Update README with AutoTokenizer usage (about 2 months ago)
latin_bpe_32000.model, 565 kB, xet: Upload folder using huggingface_hub (about 2 months ago)
latin_bpe_32000.vocab, 520 kB, Safe: Upload folder using huggingface_hub (about 2 months ago)
latin_tokenizer.py, 2.32 kB, Safe: Add custom LatinTokenizer for AutoTokenizer support (about 2 months ago)
tokenizer.json, 4.12 MB: Add tokenizer.json for AutoTokenizer support and download tracking (about 2 months ago)
tokenizer_config.json, 314 Bytes: Update tokenizer_config for fast tokenizer (keep auto_map for backwards compat) (about 2 months ago)