SindhiLM-Tokenizer-v2: morpheme-aware BPE with SindhiNLTK pre-segmentation, fixed noise filter, byte-ghost reduction
Commit f75c2c6 (verified)
- SHA256: 3fd169731d2cbde95e10bf356d66d5997fd885dd8dbb6fb4684da3f23b2585d8
- Pointer size: 133 Bytes
- Size of remote file: 11.4 MB
- Xet hash: d3f835122bddb470f53048ff36f1a5116791b8f3a003f9d17b3b03b0b81cc5fe
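The SHA256 listed above can be used to check a downloaded copy of the remote file. The following is a minimal sketch, assuming the listed digest is the SHA256 of the full remote file's contents (not of the 133-byte pointer); the function names `sha256_of_file` and `matches_expected` are hypothetical helpers introduced here for illustration, not part of any repository tooling.

```python
import hashlib

# Digest copied from the metadata above; assumed to describe the full remote file.
EXPECTED_SHA256 = "3fd169731d2cbde95e10bf356d66d5997fd885dd8dbb6fb4684da3f23b2585d8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 of a file, reading it in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 with the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("path/to/downloaded/file")` with the local path of the download (a placeholder here, since the file name is not shown above) returns `True` only if the contents hash to the listed value. Reading in blocks keeps memory use flat for a file of this size.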
Xet stores large files inside Git efficiently by splitting them into unique chunks, which speeds up uploads and downloads.