Fix tokenizer compatibility with newer transformers versions
Issue: A community member reported the error "Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 88 column 3" when trying to load the model with newer versions of the transformers library.
Root cause: The field `"prepend_scheme": "always"` is deprecated and not recognized by the Rust tokenizers backend in newer library versions.
Solution: Updated tokenizer.json to resolve the compatibility issue with recent versions of the tokenizers library.
Changes:
Line 84: Changed `"prepend_scheme": "always"` to `"add_prefix_space": true` in `pre_tokenizer`
Line 167: Changed `"prepend_scheme": "always"` to `"add_prefix_space": true` in `decoder`
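For anyone hitting the same error on another model, the edit above can be applied programmatically. Below is a minimal sketch, assuming the deprecated field appears directly inside the `pre_tokenizer` and `decoder` objects of tokenizer.json; the exact layout varies by model (nested sequences of pre-tokenizers would need a recursive walk), and the function name `patch_tokenizer_json` is illustrative, not part of any library.

```python
import json

def patch_tokenizer_json(path):
    """Replace the deprecated "prepend_scheme" field with
    "add_prefix_space": true in the pre_tokenizer and decoder
    sections of a tokenizer.json file (sketch; assumes the field
    sits at the top level of each section)."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)

    for section in ("pre_tokenizer", "decoder"):
        entry = config.get(section)
        # Only patch sections that actually carry the deprecated field
        if isinstance(entry, dict) and entry.pop("prepend_scheme", None) is not None:
            entry["add_prefix_space"] = True

    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
```

After patching, re-loading the tokenizer with the newer library version should no longer raise the untagged-enum error.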
tokenizer.json (+2 −2)

The file is stored via Git LFS, so the diff only shows the updated pointer:

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:687a450aae900a541fd7c85890b174af4bbe6351ca3dbfeb9f3964e57b9401c1
+size 16316221
```