We simply discard the system prompts.

**To put it all together, the text before tokenization looks like this:**

```python
general_instruction_response_text = "<|begin_of_text|>{question} {response}<|end_of_text|>"

instruction_augmented_text = "<|begin_of_text|>{instruction augmented text}<|end_of_text|>"
```

Then, for tokenization, you don't need to add the BOS and EOS token ids yourself. The tokenization code looks like this:

```python
text_ids = tokenizer(text, add_special_tokens=False, **kwargs).input_ids
```
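As a minimal sketch of how the two pieces fit together (the helper name and the toy question/response pair below are illustrative, not from this repo), you can fill the template by embedding the markers directly in the string; since the BOS/EOS markers are then already part of the text, the tokenizer must not add them a second time, which is why `add_special_tokens=False` is passed above:

```python
# Illustrative sketch: build the pre-tokenization text by embedding the
# BOS/EOS markers directly in the string, as in the templates above.
BOS, EOS = "<|begin_of_text|>", "<|end_of_text|>"

def build_pretraining_text(question: str, response: str) -> str:
    # The markers are part of the text itself, so the tokenizer is later
    # called with add_special_tokens=False to avoid duplicating them.
    return f"{BOS}{question} {response}{EOS}"

text = build_pretraining_text("What is 2 + 2?", "The answer is 4.")
print(text)
# → <|begin_of_text|>What is 2 + 2? The answer is 4.<|end_of_text|>
```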
## Citation

If you find our work helpful, please cite us: