Collection: Scansion Models
Models for automatic metrical scansion of poetry in Galician. The best model in the series is byt5-scansion-gl-cx.
Metrical scansion in Galician (lexical to metrical syllabification). Fine-tuned mT5 (google/mt5-small).
Operates on a single line, without additional context lines, unlike the models ending in -cx in this collection.
Input format: E / os / *her- / mos / re- / ver- / *de- / cen / do / es- / *pri- / to / on- / de / mo- / *ra- / ren
Output for the above: E os / *her- / mos / re- / ver- / *de- / cen / do es- / *pri- / to on- / de / mo- / *ra- / ren
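To compare the two strings above, a minimal sketch follows. It assumes, from the example alone, that "/" separates syllable units and that "*" marks a stressed syllable; the helper names `split_syllables` and `stressed_positions` are hypothetical and not part of the model or its library.

```python
def split_syllables(line: str) -> list[str]:
    """Split a scansion string on "/" and trim whitespace around each unit."""
    return [unit.strip() for unit in line.split("/") if unit.strip()]


def stressed_positions(line: str) -> list[int]:
    """Return the 1-based positions of units carrying a "*" mark (assumed: stress)."""
    return [i for i, unit in enumerate(split_syllables(line), start=1) if "*" in unit]


# The input and output strings from the example above
lexical = "E / os / *her- / mos / re- / ver- / *de- / cen / do / es- / *pri- / to / on- / de / mo- / *ra- / ren"
metrical = "E os / *her- / mos / re- / ver- / *de- / cen / do es- / *pri- / to on- / de / mo- / *ra- / ren"

print(len(split_syllables(lexical)), stressed_positions(lexical))
print(len(split_syllables(metrical)), stressed_positions(metrical))
```

For this example the input has 17 units and the output has 14, because three pairs of adjacent syllables are joined in the output ("E os", "do es-", "to on-").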
Use the code below to get started with the model.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "compellit/mt5-scan-gl-sg"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model, and move the model to the selected device
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

# One line of verse in the input format shown above
text = "E / os / *her- / mos / re- / ver- / *de- / cen / do / es- / *pri- / to / on- / de / mo- / *ra- / ren"
inputs = tokenizer(text, return_tensors="pt").to(device)

# Greedy decoding without gradient tracking
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=128,
        num_beams=1,
        do_sample=False,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
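Since the output is generated text, it can be worth checking the decoded string against the input before using it. The sketch below is one possible check, not part of the model's API: it assumes the model only joins adjacent lexical syllables with a space, as in the example above, and the helper names `split_units`, `merged_units`, and `preserves_syllables` are hypothetical.

```python
def split_units(line: str) -> list[str]:
    """Split a scansion string on "/" and trim whitespace around each unit."""
    return [unit.strip() for unit in line.split("/") if unit.strip()]


def merged_units(metrical: str) -> list[str]:
    """Return output units that contain more than one lexical syllable."""
    return [unit for unit in split_units(metrical) if len(unit.split()) > 1]


def preserves_syllables(lexical: str, metrical: str) -> bool:
    """Check that the output, once merged units are split again, equals the input."""
    flattened = [syl for unit in split_units(metrical) for syl in unit.split()]
    return flattened == split_units(lexical)


lexical = "E / os / *her- / mos / re- / ver- / *de- / cen / do / es- / *pri- / to / on- / de / mo- / *ra- / ren"
# In practice this would be the string decoded from the model output
metrical = "E os / *her- / mos / re- / ver- / *de- / cen / do es- / *pri- / to on- / de / mo- / *ra- / ren"

print(merged_units(metrical))
print(preserves_syllables(lexical, metrical))
```

If the model can also alter syllables in other ways (an assumption not covered by the example), `preserves_syllables` would report those outputs as inconsistent, so treat a `False` result as a prompt for manual inspection rather than as proof of an error.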
Base model
google/mt5-small