This model card describes Model 2 of the UNIBA system presented at EVALITA 2026. This version of the model is specifically optimized for Italian crossword solving by exploiting partially known answer strings.


Model Card: uniba/cruciverb-it-IT5-partial

Model Details

  • Developed by: Pierpaolo Basile, Department of Computer Science, University of Bari Aldo Moro
  • Model Type: Encoder-Decoder Transformer
  • Language(s): Italian
  • Base Model: IT5 Large (gsarti/it5-large)
  • Task: Crossword Clue Answering (Cruciverb-IT @ EVALITA 2026)
  • License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Uses

Direct Use

This model is designed to generate candidate answers for Italian crossword clues. It is particularly effective when some characters of the answer are already known (e.g., from intersecting words in a grid).
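
As an illustration of how a partial answer string can be derived from letters already placed in the grid, here is a minimal sketch (partial_from_grid is a hypothetical helper, not part of the released code):

# Build the partial answer string expected by the model, assuming the known
# crossing letters are given as a {position: letter} mapping (hypothetical helper).
def partial_from_grid(length, known):
    return "".join(known.get(i, "_") for i in range(length))

print(partial_from_grid(4, {0: "i", 2: "r", 3: "i"}))  # i_ri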

Out-of-Scope Use

  • Solving crosswords in languages other than Italian.
  • General-purpose question answering outside the specific linguistic constraints of crossword puzzles.

Training Details

Training Data

The model was trained on 16,254,904 examples derived from an original set of 374,766 clue-answer pairs. To simulate realistic solving conditions, the data was augmented with character masking: for each answer, all possible partial solutions of the given length were generated.
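
The exact augmentation script is not included in this card; the following is a minimal sketch of such character masking, assuming every subset of answer positions can be hidden (generate_partials is a hypothetical helper name):

from itertools import combinations

def generate_partials(answer):
    """Yield (number of masked characters, partial solution) pairs, where '_'
    marks a hidden character. This is an assumed reconstruction of the
    augmentation step, not the original script."""
    n = len(answer)
    for k in range(1, n + 1):
        for positions in combinations(range(n), k):
            yield k, "".join("_" if i in positions else c for i, c in enumerate(answer))

# Example: all partial solutions for the answer "ieri"
for missing, partial in generate_partials("ieri"):
    print(missing, partial)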

Training Procedure

  • Input Format: Plain-text instruction (an example prompt built from this template is shown after this list): "Trova la soluzione dove _ indica un carattere mancante. Caratteri mancanti: {0}. Lunghezza soluzione: {1}. Soluzione parziale: {2}. Indizio: {3}".

  • Hyperparameters:
      • Learning Rate:
      • Batch Size: 32
      • Weight Decay: 0.01
      • Epochs: 1

  • Hardware: Single NVIDIA RTX A6000 (48 GB VRAM).
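
As an illustration of the input format above, here is a minimal sketch of how a prompt could be assembled from the template (build_prompt is a hypothetical helper, not part of the released code):

def build_prompt(clue, partial):
    # Fill the instruction template: number of missing characters, answer
    # length, partial solution, and clue, in that order.
    missing = partial.count("_")
    return (
        "Trova la soluzione dove _ indica un carattere mancante. "
        f"Caratteri mancanti: {missing}. "
        f"Lunghezza soluzione: {len(partial)}. "
        f"Soluzione parziale: {partial}. "
        f"Indizio: {clue}"
    )

print(build_prompt("Un passo indietro nel tempo", "i_ri"))
# Trova la soluzione dove _ indica un carattere mancante. Caratteri mancanti: 1. Lunghezza soluzione: 4. Soluzione parziale: i_ri. Indizio: Un passo indietro nel tempo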

Evaluation

Testing Data & Metrics

Evaluation was performed on the official Cruciverb-IT validation and test sets using accuracy at ranks 1 and 10 (acc@1, acc@10) and Mean Reciprocal Rank (MRR).
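
The official evaluation script is not reproduced here; the following is a minimal sketch of how acc@1, acc@10, and MRR are typically computed from ranked candidate lists:

def evaluate(ranked_candidates, gold_answers, k=10):
    """ranked_candidates: list of candidate lists (best first); gold_answers:
    list of gold strings. Returns (acc@1, acc@k, MRR), computed over the top-k
    candidates only (a simplifying assumption of this sketch)."""
    acc1 = acck = mrr = 0.0
    for candidates, gold in zip(ranked_candidates, gold_answers):
        if candidates and candidates[0] == gold:
            acc1 += 1
        if gold in candidates[:k]:
            acck += 1
            mrr += 1.0 / (candidates.index(gold) + 1)
    n = len(gold_answers)
    return acc1 / n, acck / n, mrr / n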

Results

System         acc@1   acc@10   MRR
UNIBA-Model2   0.43    0.59     0.47

Bias, Risks, and Limitations

  • Bottleneck: Performance is lower when no characters are initially known.
  • Data Scarcity: The model relies solely on the provided training data and does not consult external dictionaries or encyclopedias.

How to Get Started

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned IT5 checkpoint and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("uniba/cruciverb-it-IT5-partial")
model = AutoModelForSeq2SeqLM.from_pretrained("uniba/cruciverb-it-IT5-partial")

# The prompt follows the training template: number of missing characters,
# answer length, partial solution, and clue.
input_text = "Trova la soluzione dove _ indica un carattere mancante. Caratteri mancanti: 1. Lunghezza soluzione: 4. Soluzione parziale: i_ri. Indizio: Un passo indietro nel tempo"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # Output: ieri
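
To obtain a ranked list of candidates (as scored by acc@10 and MRR), beam search with multiple returned sequences can be used; the decoding settings below are illustrative and are not documented as the settings used by the original system:

# Return the 10 highest-scoring beam-search hypotheses as candidate answers.
outputs = model.generate(**inputs, num_beams=10, num_return_sequences=10)
candidates = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
print(candidates)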

Acknowledgments

We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013), Spoke 6 - Symbiotic AI (CUP H97G22000210007) under the NRRP MUR program funded by NextGenerationEU.

