almaghrabima/deeplatent-tokenizer-parity

Tags: tokenizer, bpe, myte, sarf, parity-aware, deeplatent, bilingual, arabic-english, morfessor
deeplatent-tokenizer-parity / training
23.5 GB
  • 1 contributor
History: 5 commits
almaghrabima
Upload parity-aware MYTE tokenizer artifacts
a879a93 verified 5 days ago
  • merges.txt
    1.29 MB
    Upload parity-aware MYTE tokenizer artifacts 5 days ago
  • phase1_stats.json
    217 Bytes
    Upload parity-aware MYTE tokenizer artifacts 5 days ago
  • train.ar
    13.7 GB
    xet
    Upload parity-aware MYTE tokenizer artifacts 5 days ago
  • train.en
    9.74 GB
    xet
    Upload parity-aware MYTE tokenizer artifacts 5 days ago