---
license: llama2
---
This is a slightly modified version of the original [WizardLM/WizardLM-13B-V1.2 checkpoint](https://huggingface.co/WizardLM/WizardLM-13B-V1.2) that fixes a few bugs:
* In the original checkpoint, the BOS token is set to the EOS token (`</s>`, token ID 2). In this version, the BOS is reverted to `<s>` (token ID 1).
* The original has a mismatch between the tokenizer's vocab size and the model's embedding vocab size: the tokenizer includes an added `[PAD]` token, bringing its vocab to 32,001 tokens. This discrepancy can cause index errors. This version removes the added `[PAD]` in favor of using `<unk>` (token ID 0) for padding, reverting the tokenizer's vocab to 32,000 tokens to match the model's vocab size.
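As a minimal illustration of the padding fix (a sketch, not the actual tokenizer code): instead of appending an extra `[PAD]` token, which would grow the vocab to 32,001 and exceed the model's 32,000 embedding rows, padding reuses the existing `<unk>` token at ID 0, so every padded ID stays within the embedding range.

```python
# Illustrative sketch of the padding fix described above.
VOCAB_SIZE = 32_000                       # model embedding rows
SPECIAL = {"<unk>": 0, "<s>": 1, "</s>": 2}  # BOS is <s> (ID 1), as restored here


def pad_ids(ids, length, pad_id=SPECIAL["<unk>"]):
    """Right-pad a list of token IDs to `length` using <unk> (ID 0)."""
    return ids + [pad_id] * (length - len(ids))


padded = pad_ids([1, 305, 78], 6)
# Every padded ID stays within the model's embedding range, so no index errors.
assert all(i < VOCAB_SIZE for i in padded)
```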
For all other information about this model, refer to the original [WizardLM/WizardLM-13B-V1.2 checkpoint](https://huggingface.co/WizardLM/WizardLM-13B-V1.2).