The way to perform infilling generation is to place the input text into this format:

\<SUF\> {some text following cursor} \<PRE\> {some prelude text here} \<MID\> ...

The language model's output is then generated after the \<MID\> token.
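The reordering can be illustrated at the string level. In this sketch the sentinels are written as literal text purely for readability; the actual model expects the dedicated sentinel token ids, not these strings, and `make_fim_prompt` is a hypothetical helper, not part of the released tokenizer:

```python
def make_fim_prompt(prelude, suffix):
    """Reorder a (prelude, suffix) pair into suffix-first FIM order;
    the model is then asked to write the missing middle after <MID>."""
    return f"<SUF>{suffix}<PRE>{prelude}<MID>"

prompt = make_fim_prompt("this is some text preceding the cursor,",
                         "and this is some text after it.")
# The suffix comes first, and the prompt ends with <MID>, where
# generation of the missing middle begins.
```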
As a concrete example, here is a code snippet that should allow a model to perform infilling:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CarperAI/FIM-NeoX-1.3B")
model = AutoModelForCausalLM.from_pretrained("CarperAI/FIM-NeoX-1.3B")

prelude = "this is some text preceding the cursor,"
suffix = "and this is some text after it."

# Sentinel token ids: 50253 = <SUF>, 50254 = <PRE>, 50255 = <MID>
input_ids = [50253, *tokenizer(suffix)["input_ids"],
             50254, *tokenizer(prelude)["input_ids"], 50255]

# generate() expects a batched tensor of token ids
infilled = model.generate(torch.tensor([input_ids]))
```

We are working on making a better interface for this in future model releases or updates to the tokenizer.
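Since `generate` returns the prompt ids followed by the newly generated ids, recovering the completed document means keeping only the tokens after the \<MID\> sentinel and splicing the decoded middle between the prelude and the suffix. A minimal sketch of that post-processing, using a toy vocabulary in place of the real tokenizer (`splice_infill` and `toy_vocab` are illustrative names, not part of the library):

```python
MID_ID = 50255  # <MID> sentinel id, as in the snippet above

def splice_infill(output_ids, prelude, suffix, decode):
    """Keep only the ids generated after <MID>, decode them, and
    splice the resulting middle between the prelude and the suffix."""
    mid_pos = output_ids.index(MID_ID)
    middle = decode(output_ids[mid_pos + 1:])
    return prelude + middle + suffix

# Toy decode standing in for tokenizer.decode
toy_vocab = {1: " fits", 2: " neatly"}
decode = lambda ids: "".join(toy_vocab[i] for i in ids)

doc = splice_infill([50253, 50254, 50255, 1, 2],
                    "this text", " after it.", decode)
# doc == "this text fits neatly after it."
```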
## Intended Uses and Limitations