Update README.md
README.md
CHANGED
@@ -106,7 +106,8 @@ The model achieved convergence at a perplexity of 1.14, demonstrating strong lan
 
 - **Type**: Byte-Pair Encoding (BPE)
 - **Vocabulary Size**: 32,000 tokens
-- **Special Tokens**: Includes `<UNK>`, `<PAD>`, `<BOS>`, `<EOS>`,
+- **Special Tokens**: Includes `<UNK>`, `<PAD>`, `<BOS>`, `<EOS>`, `<|im_start|>`, `<|im_end|>`, `<|system|>`, `<|user|>`, `<|assistant|>`, `<|endoftext|>`, `<|eot_id|>`, `[INST]`, `[/INST]`
+
 - **Pre-tokenizer**: ByteLevel encoding
 
 ## Intended Use
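
Note on the change: the new line extends the special-token list with chat and instruction markers. As a rough illustration of what "special" means here, the sketch below splits raw text so that each listed token stays intact as one unit instead of being broken up by BPE merges. It uses only the Python standard library and is not the model's actual tokenizer; the helper name `split_on_special_tokens` is hypothetical, and a real pipeline would still run BPE over the non-special segments.

```python
import re

# Special tokens as listed in the updated README line.
SPECIAL_TOKENS = [
    "<UNK>", "<PAD>", "<BOS>", "<EOS>",
    "<|im_start|>", "<|im_end|>", "<|system|>", "<|user|>",
    "<|assistant|>", "<|endoftext|>", "<|eot_id|>", "[INST]", "[/INST]",
]

# Longest-first alternation so a shorter token can never shadow a longer one.
# re.escape is needed because the tokens contain regex metacharacters (|, [, ]).
_PATTERN = re.compile(
    "("
    + "|".join(re.escape(t) for t in sorted(SPECIAL_TOKENS, key=len, reverse=True))
    + ")"
)


def split_on_special_tokens(text):
    """Hypothetical helper: return (segment, is_special) pairs in order.

    Special tokens are kept whole; everything between them is returned
    as ordinary text that a BPE tokenizer would encode afterwards.
    """
    parts = _PATTERN.split(text)  # the capturing group keeps the separators
    return [(p, p in SPECIAL_TOKENS) for p in parts if p]


if __name__ == "__main__":
    for segment, is_special in split_on_special_tokens(
        "<|im_start|>user\nhello<|im_end|>"
    ):
        print(repr(segment), "special" if is_special else "text")
```

Splitting before encoding is one common way to keep control tokens atomic. Whether this tokenizer does so through an added-tokens table or some other mechanism is not stated in the README.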