ansul90/hindi-bpe-tokenizer
Tags: Hindi, custom, tokenizer, bpe, hindi, devanagari, byte-pair-encoding, nlp
License: mit
Branch: main
Size: 1.53 MB
Contributors: 2
History: 3 commits
Latest commit: Update README.md (dd38b63, verified), ansul90, about 2 months ago
File                     Size       Last commit                                                       Updated
.gitattributes           1.52 kB    initial commit                                                    about 2 months ago
.gitignore               195 Bytes  Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago
README.md                7.28 kB    Update README.md                                                  about 2 months ago
hindi_bpe_tokenizer.py   8.14 kB    Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago
hindi_corpus.txt         1.51 MB    Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago
pyproject.toml           202 Bytes  Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago
train_bpe_simple.py      4.76 kB    Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago
training_results.json    2.29 kB    Initial commit: Hindi BPE Tokenizer (without large model file)    about 2 months ago