Final Rescue: Full vocab restoration with Unigram scores and Metaspace fix 0d52c92 verified suchirsalhan committed 6 days ago
Final Fix: Correct Metaspace mapping and Unigram scores 79f5510 verified suchirsalhan committed 6 days ago
Fix: Final Metaspace decoding using exact vocab marker dd7139b verified suchirsalhan committed 6 days ago
Fix: Robust SPM-to-HF conversion with merges and byte-fallback 9fb11b8 verified suchirsalhan committed 6 days ago
Fix: Final Metaspace decoding using exact vocab marker dacd2f0 verified suchirsalhan committed 6 days ago
Fix: Used Llama-style identity wrapping to preserve SPM IDs and fix spacing bb8d4b5 verified suchirsalhan committed 6 days ago
Fix: Final Metaspace decoding using exact vocab marker ff01214 verified suchirsalhan committed 6 days ago
Fix: Reverted to native SentencePiece handling (removed ByteLevel mismatch) 6c75610 verified suchirsalhan committed 6 days ago
Fix: Applied ByteLevel pre-tokenization and decoding for proper spacing/hex handling 837e60a verified suchirsalhan committed 6 days ago
Fix: Clean base tokenizer using universal metaspace decoding 15e1bd6 verified suchirsalhan committed 6 days ago
Fix: Universal Metaspace decoding without keyword conflicts a29d3e1 verified suchirsalhan committed 6 days ago
Fix: Manual vocab extraction from spm.model pieces 425507b verified suchirsalhan committed 6 days ago
Fix: Environment issues resolved, full Fast Tokenizer uploaded. 3d13f8f verified suchirsalhan committed 6 days ago