
ansul90/hindi-bpe-tokenizer

Tags: Hindi, custom, tokenizer, bpe, hindi, devanagari, byte-pair-encoding, nlp
hindi-bpe-tokenizer
1.53 MB
  • 2 contributors
History: 3 commits
Latest commit: Update README.md (dd38b63, verified) by ansul90, about 2 months ago
  • .gitattributes
    1.52 kB
    initial commit about 2 months ago
  • .gitignore
    195 Bytes
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago
  • README.md
    7.28 kB
    Update README.md about 2 months ago
  • hindi_bpe_tokenizer.py
    8.14 kB
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago
  • hindi_corpus.txt
    1.51 MB
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago
  • pyproject.toml
    202 Bytes
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago
  • train_bpe_simple.py
    4.76 kB
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago
  • training_results.json
    2.29 kB
    Initial commit: Hindi BPE Tokenizer (without large model file) about 2 months ago