Update tokenization_chatglm.py
Based on the [documentation](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.PreTrainedTokenizer.decode) and the reference implementation of the Hugging Face tokenizer, the `decode` method should accept both a single integer and an empty list as input.
This simple modification would make ChatGLM-6B compatible with inference frameworks such as [Basaran](https://github.com/hyperonym/basaran).
- tokenization_chatglm.py +4 -0
tokenization_chatglm.py
CHANGED
```diff
@@ -264,6 +264,10 @@ class ChatGLMTokenizer(PreTrainedTokenizer):
         spaces_between_special_tokens: bool = True,
         **kwargs
     ) -> str:
+        if not isinstance(token_ids, list):
+            token_ids = [token_ids]
+        if len(token_ids) == 0:
+            return ""
         if isinstance(token_ids[0], list):
             tokens = []
             for single_token_ids in token_ids:
```