The model is intended to be used on segments of **250** characters.
Emendator Reconstruction: "Diligam te Domine fortitudo mea Dominus firmamentum meum refugium meum liberator meus"

To use Emendator, you can load it via the Transformers library:
```python
import torch
from transformers import T5ForConditionalGeneration, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model_path = 'aimgo/Emendator'
tokenizer = AutoTokenizer.from_pretrained(model_path)

model = T5ForConditionalGeneration.from_pretrained(model_path, torch_dtype=torch.bfloat16).to(device)
model.eval()

# Example input: a list of text segments, each up to 250 characters.
texts = ["Diligam te Domine fortitudo mea Dominus firmamentum meum"]

enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=256).to(device)
max_input_len = enc["input_ids"].shape[1]

# Autocast to bfloat16 for faster inference.
with torch.autocast(device_type=device, dtype=torch.bfloat16, enabled=True):
    outputs = model.generate(
        enc["input_ids"],
        attention_mask=enc["attention_mask"],
        max_new_tokens=max_input_len + 32,
        num_beams=4,
        do_sample=False,
        early_stopping=True,
        repetition_penalty=1.15,
        use_cache=True,
    )

outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
```
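Since the model expects segments of at most 250 characters, longer texts should be split before being passed as `texts`. A minimal sketch of such a splitter (the helper name `chunk_text` is ours, not part of this repo, and more careful splitting on word boundaries may give better reconstructions):

```python
def chunk_text(text, size=250):
    """Split text into consecutive segments of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Each resulting segment can then be reconstructed independently.
segments = chunk_text("Diligam te Domine fortitudo mea " * 20)
```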

If you use this in your work, please cite:
```
@misc{mccarthy2026Emendator,