Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ tags: ["gemma","chatml"]
 
 This repository includes a fast tokenizer for [google/gemma-7b](https://huggingface.co/google/gemma-7b) with the ChatML format. The Tokenizer was created by replacing the string values of original tokens with id `106` (`<start_of_turn>`) and `107` (`<end_of_turn>`) with the chatML tokens `<|im_start|>` and `<|im_end|>`.
 
-No
+No new tokens were added during that process to ensure that the original model's embedding doesn't need to be modified.
 
 ```python
 from transformers import AutoTokenizer