---
library_name: transformers
base_model:
- google/mt5-small
license: apache-2.0
language:
- gl
---
# Model Card for mt5-scan-gl-cx
Fine-tuned mT5 for metrical scansion in Galician (lexical-to-metrical syllabification).
The model uses the previous and following input lines as context, as in this example from "Á moda" by Filomena Dato:
Input format: `PREV: sin / *fe / nin / cre- / *en- / zas | CUR: *ten / *cen- / tos / de / al- / *ta- / res | NEXT: *che- / os / de / ri- / *que- / zas | OUTPUT: `
Output for the above: `*ten / *cen- / tos / de al- / *ta- / res`
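The input string for a given line can be assembled from the three pre-syllabified lines. A minimal sketch, assuming only the `PREV`/`CUR`/`NEXT`/`OUTPUT` field layout shown above (the helper name is illustrative, not part of the released code):

```python
def build_input(prev_line: str, cur_line: str, next_line: str) -> str:
    # Assemble the PREV/CUR/NEXT/OUTPUT prompt format shown above.
    # Note the trailing space after "OUTPUT:", matching the example.
    return f"PREV: {prev_line} | CUR: {cur_line} | NEXT: {next_line} | OUTPUT: "

text = build_input(
    "sin / *fe / nin / cre- / *en- / zas",
    "*ten / *cen- / tos / de / al- / *ta- / res",
    "*che- / os / de / ri- / *que- / zas",
)
```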
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "compellit/mt5-scan-gl-cx"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
text = "PREV: sin / *fe / nin / cre- / *en- / zas | CUR: *ten / *cen- / tos / de / al- / *ta- / res | NEXT: *che- / os / de / ri- / *que- / zas | OUTPUT: "
inputs = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=256,
        num_beams=1,  # greedy decoding
        do_sample=False,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
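The decoded string can be split back into syllables. A small post-processing sketch, assuming (based on the example above, not on documented behavior) that syllables are separated by `" / "` and that a leading `*` marks a stressed syllable:

```python
def parse_scansion(decoded: str) -> list[tuple[str, bool]]:
    # Split the model output on the " / " separator and read the
    # leading "*" as a stress flag; this interpretation is inferred
    # from the example in this card, not from a formal spec.
    syllables = []
    for token in decoded.split(" / "):
        token = token.strip()
        stressed = token.startswith("*")
        syllables.append((token.lstrip("*"), stressed))
    return syllables

parsed = parse_scansion("*ten / *cen- / tos / de al- / *ta- / res")
# -> [('ten', True), ('cen-', True), ('tos', False),
#     ('de al-', False), ('ta-', True), ('res', False)]
```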