Nexuss0781/Ethio-BBPE
Datasets: Nexuss0781/synaxarium, Nexuss0781/conon-biblical-am-en
Tags: Amharic, tokenizers, amharic, geez, ethiopic, biblical-texts, synaxarium, byte-level-bpe
License: apache-2.0
Branch: main · Repository size: 2.89 MB
1 contributor · History: 61 commits
Latest commit: Nexuss0781, "Upload folder using huggingface_hub" (facc1cb, verified), about 1 month ago
| Name | Size | Status | Last commit | Updated |
|---|---|---|---|---|
| models/ | - | - | Delete models/demo_tokenizer/vocab.json with huggingface_hub | about 1 month ago |
| scripts/ | - | - | Delete scripts/prepare_datasets.py with huggingface_hub | about 1 month ago |
| .gitattributes | 1.58 kB | Safe | Upload training_corpus.txt with huggingface_hub | about 1 month ago |
| .gitignore | 507 Bytes | Safe | Upload .gitignore with huggingface_hub | about 1 month ago |
| LICENSE | 1.09 kB | Safe | feat: Initial release of EthioBBPE - Ethiopian Language Tokenizer | about 1 month ago |
| README.md | 8.63 kB | - | Upload README.md with huggingface_hub | about 1 month ago |
| config.json | 431 Bytes | Safe | Production release: EthioBBPE tokenizer with perfect Amharic reconstruction | about 1 month ago |
| merges.txt | 2.11 kB | Safe | Upload merges.txt with huggingface_hub | about 1 month ago |
| pyproject.toml | 2.59 kB | Safe | Upload pyproject.toml with huggingface_hub | about 1 month ago |
| requirements.txt | 19 Bytes | Safe | feat: Initial release of EthioBBPE - Ethiopian Language Tokenizer | about 1 month ago |
| special_tokens_map.json | 95 Bytes | Safe | Upload special_tokens_map.json with huggingface_hub | about 1 month ago |
| tokenizer.json | 1.34 MB | Safe | Production release: EthioBBPE tokenizer with perfect Amharic reconstruction | about 1 month ago |
| training_metrics.json | 1.34 kB | Safe | Production release: EthioBBPE tokenizer with perfect Amharic reconstruction | about 1 month ago |
| vocab.json.gz | 138 kB | xet | Upload vocab.json.gz with huggingface_hub | about 1 month ago |