Alonadoli committed
Commit 95d2e46 · verified
1 Parent(s): 44b916e

Fix tokenizer compatibility with newer transformers versions

Issue: A community member reported the error: "Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 88 column 3" when trying to load the model with newer versions of the transformers library.

Root cause: The field prepend_scheme: "always" is deprecated and not recognized by the Rust tokenizers backend in newer library versions.

Solution: Replaced both occurrences of "prepend_scheme": "always" in tokenizer.json with "add_prefix_space": true so that the file loads with recent versions of the tokenizers library.

Changes:
Line 84: Changed "prepend_scheme": "always" to "add_prefix_space": true in pre_tokenizer
Line 167: Changed "prepend_scheme": "always" to "add_prefix_space": true in decoder
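
The two edits listed above could also be applied programmatically. The following is a minimal sketch, not the method used for this commit: the helper names `swap_prefix_field` and `patch_tokenizer_config` are hypothetical, the sample dictionary is a stand-in for the real tokenizer.json, and the "type" values are placeholders because the commit does not state which pre-tokenizer or decoder type is involved.

```python
import json

OLD_KEY, OLD_VALUE = "prepend_scheme", "always"
NEW_KEY, NEW_VALUE = "add_prefix_space", True


def swap_prefix_field(node):
    """Recursively replace the old field in place; return the number of swaps."""
    swaps = 0
    if isinstance(node, dict):
        if node.get(OLD_KEY) == OLD_VALUE:
            del node[OLD_KEY]
            node[NEW_KEY] = NEW_VALUE  # appended last; key order is not preserved
            swaps += 1
        for child in node.values():
            swaps += swap_prefix_field(child)
    elif isinstance(node, list):
        for child in node:
            swaps += swap_prefix_field(child)
    return swaps


def patch_tokenizer_config(config):
    """Patch only the two sections named in this commit (hypothetical helper)."""
    sections = ("pre_tokenizer", "decoder")
    return sum(swap_prefix_field(config.get(name)) for name in sections)


# Tiny stand-in for tokenizer.json; the "type" values are placeholders.
sample = {
    "pre_tokenizer": {"type": "SomePreTokenizer", "prepend_scheme": "always"},
    "decoder": {"type": "SomeDecoder", "prepend_scheme": "always"},
}
swapped = patch_tokenizer_config(sample)
print(swapped, json.dumps(sample))
```

For a real file, the dictionary would come from `json.load` and be written back with `json.dump`. Whether the rewritten file then loads should still be checked against the installed tokenizers version.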

Files changed (1)
  1. tokenizer.json +2 -2
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ac9bd80d04d21b4df95917bdf7c750cbea28acc30a4462c76ca2b9c86d371863
- size 16316225
+ oid sha256:687a450aae900a541fd7c85890b174af4bbe6351ca3dbfeb9f3964e57b9401c1
+ size 16316221
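
Because tokenizer.json is stored with Git LFS, the diff shows only the pointer file: `oid` is the SHA-256 digest of the file contents and `size` is its length in bytes. A downloaded copy can be checked against the new pointer with a sketch like the one below. The function name `lfs_pointer_fields` is hypothetical, and the demo runs on a throwaway file rather than the real tokenizer.json.

```python
import hashlib
import os
import tempfile


def lfs_pointer_fields(path):
    """Return the (oid, size) pair a Git LFS pointer would record for a file."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return "sha256:" + digest.hexdigest(), size


# Demo on a temporary file; point the function at a local tokenizer.json instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'{"add_prefix_space": true}')
    demo_path = tmp.name
demo_oid, demo_size = lfs_pointer_fields(demo_path)
os.remove(demo_path)
print(demo_oid, demo_size)
```

For the file in this commit, the expected values are the `+ oid` and `+ size` lines above. The 4-byte drop in size (16316225 to 16316221) is consistent with two replacements that each shorten the text by 2 bytes, assuming the spacing shown in the change list.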