---
library_name: transformers
base_model:
- google/mt5-small
license: apache-2.0
language:
- gl
---

# Model Card for mt5-scan-gl-cx
|
Metrical scansion of Galician verse: the model maps the lexical syllabification of a line to its metrical syllabification. Fine-tuned from google/mt5-small.
|
|
The model takes the previous and following lines of the poem as context, as in this example from "Á moda" by Filomena Dato.
|
|
Input format: `PREV: sin / *fe / nin / cre- / *en- / zas | CUR: *ten / *cen- / tos / de / al- / *ta- / res | NEXT: *che- / os / de / ri- / *que- / zas | OUTPUT: `
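Rather than hand-assembling the prompt, the three syllable lists can be joined programmatically. The helper below is an illustrative sketch (`build_scansion_input` is not part of the released code); it simply concatenates the fields in the format shown above.

```python
def build_scansion_input(prev, cur, nxt):
    """Assemble the PREV/CUR/NEXT prompt string the model expects.

    Each argument is a list of syllable strings, where '*' marks a
    stressed syllable and a trailing '-' marks a word-internal
    syllable boundary.
    """
    return (
        "PREV: " + " / ".join(prev)
        + " | CUR: " + " / ".join(cur)
        + " | NEXT: " + " / ".join(nxt)
        + " | OUTPUT: "
    )

# Rebuild the example prompt from the model card.
prompt = build_scansion_input(
    ["sin", "*fe", "nin", "cre-", "*en-", "zas"],
    ["*ten", "*cen-", "tos", "de", "al-", "*ta-", "res"],
    ["*che-", "os", "de", "ri-", "*que-", "zas"],
)
print(prompt)
```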
|
|
Output for the above (note the synalepha: lexical `de / al-` merges into the single metrical syllable `de al-`): `*ten / *cen- / tos / de al- / *ta- / res`
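The output can be split back into metrical syllables with plain string operations. This sketch (the helper name is hypothetical, not part of the release) counts syllables and locates stresses for the example output above.

```python
def parse_scansion(output):
    """Split a model output into metrical syllables.

    Returns the list of syllables and the 1-based positions of the
    stressed syllables (those marked with '*').
    """
    syllables = output.split(" / ")
    stresses = [i + 1 for i, s in enumerate(syllables) if s.startswith("*")]
    return syllables, stresses

syllables, stresses = parse_scansion("*ten / *cen- / tos / de al- / *ta- / res")
print(len(syllables))  # 6 metrical syllables
print(stresses)        # stresses on positions [1, 2, 5]
```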
|
|
Use the code below to get started with the model.
|
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "compellit/mt5-scan-gl-cx"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

# Previous and following lines are passed as context around the current line.
text = "PREV: sin / *fe / nin / cre- / *en- / zas | CUR: *ten / *cen- / tos / de / al- / *ta- / res | NEXT: *che- / os / de / ri- / *que- / zas | OUTPUT: "

inputs = tokenizer(text, return_tensors="pt").to(device)

# Greedy decoding (no beams, no sampling) gives deterministic output.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=256,
        num_beams=1,
        do_sample=False,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```