metaphor-cat-roberta-base-weights

This model is a version of projecte-aina/roberta-base-ca-v2 fine-tuned on the metaphor-catalan dataset for metaphor detection in Catalan text.

It performs token classification, identifying which tokens belong to metaphorical expressions using a BIO tagging scheme.

Evaluation results on the validation set after the final epoch (epoch 10):

Loss: 0.7232
Precision: 0.7188
Recall: 0.5476
F1: 0.6216
Accuracy: 0.9665

Model description

This model is a RoBERTa-based transformer for Catalan, fine-tuned for token-level metaphor detection.

The model predicts whether each token in a sentence belongs to a metaphorical expression using the following BIO labeling scheme:

O – token is not part of a metaphor
B-METAPHOR – beginning of a metaphorical expression
I-METAPHOR – continuation of a metaphorical expression

The task is framed as token classification: every token receives exactly one label, and a metaphor span consists of a B-METAPHOR token followed by zero or more I-METAPHOR tokens.

Because metaphor tokens are far less frequent than literal tokens, a class-weighted loss was applied during training to mitigate class imbalance.

This model can support research on figurative language detection in Catalan, computational linguistics experiments, and NLP pipelines requiring semantic analysis of figurative language.

Intended uses & limitations

Intended uses

This model can be used for:

Detecting metaphorical expressions in Catalan text
Linguistic analysis of figurative language
Computational linguistics research
Digital humanities and literary analysis
Supporting annotation pipelines for metaphor datasets
Preprocessing for downstream NLP tasks involving figurative language

Typical usage scenarios include:

Annotating metaphorical language in Catalan corpora
Supporting research on metaphor detection
Assisting literary or stylistic analysis tools
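
The following is a minimal inference sketch, not an official usage snippet. It assumes the checkpoint is available on the Hugging Face Hub under the repository ID mariadelcarmenramirez/metaphor-cat-roberta-base-weights and that the label names are stored in the model config. The helper names load_detector and print_metaphors are hypothetical.

```python
MODEL_ID = "mariadelcarmenramirez/metaphor-cat-roberta-base-weights"


def load_detector(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the fine-tuned checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges subword pieces and adjacent
    # B-/I- predictions of the same type into one entry per span.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def print_metaphors(text: str, detector=None) -> None:
    """Print each predicted metaphor span with its confidence score."""
    detector = detector or load_detector()
    for span in detector(text):
        # Aggregated results carry entity_group, word, score, start and end.
        print(span["entity_group"], repr(span["word"]), round(float(span["score"]), 3))
```

Calling print_metaphors("El temps és or.") (an illustrative sentence, not taken from the dataset) would download the checkpoint on first use and print any spans the model labels as METAPHOR.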

Limitations

The training dataset is relatively small, which limits generalization.
The model may not perform well on unseen domains, such as social media or informal text.
Performance may degrade on:
- highly creative language
- poetry
- domain-specific corpora
Predictions are produced at the token level, meaning additional processing may be required to reconstruct full metaphor spans.
Although class weighting helps mitigate imbalance, metaphor detection remains a challenging task, and some metaphors may still be missed.

The model should therefore be used as an assistive tool rather than a definitive annotation system.
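
The span reconstruction mentioned in the limitations can be sketched as follows. This is one possible post-processing step, not the author's own code: the function name bio_to_spans is hypothetical, and treating a stray I-METAPHOR tag as the start of a span is a lenient design choice made for this sketch.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs, end exclusive, one per metaphor span."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-METAPHOR":
            if start is not None:  # close the span that was still open
                spans.append((start, i))
            start = i
        elif tag == "I-METAPHOR":
            if start is None:  # stray I- tag: treat it as opening a span
                start = i
        else:  # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:  # span running to the end of the sentence
        spans.append((start, len(tags)))
    return spans


# Hypothetical tokens and tags, for illustration only.
tokens = ["El", "temps", "és", "or", "."]
tags = ["O", "O", "O", "B-METAPHOR", "O"]
phrases = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
print(phrases)
```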

Training and evaluation data

Training dataset:
metaphor-catalan

This dataset contains Catalan sentences annotated for metaphorical expressions using token-level BIO labels.

Each example contains:

tokens — tokenized sentence
tags — BIO labels indicating metaphor spans

Label scheme used during training:

Label	Description
O	Non-metaphorical token
B-METAPHOR	Beginning of metaphor
I-METAPHOR	Continuation of metaphor

The dataset exhibits strong class imbalance, with metaphor tokens occurring much less frequently than literal tokens. To address this, class weights were applied during training.

Example label distribution used during training:

O: 6089 tokens
B-METAPHOR: 325 tokens
I-METAPHOR: extremely rare / absent in some splits
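
The card does not say how the class weights were computed. As an illustration only, the sketch below applies the common inverse-frequency ("balanced") heuristic to the two counts listed above. I-METAPHOR is left out because no count is given for it, and the function name balanced_weights is hypothetical.

```python
# Token counts taken from the label distribution above.
label_counts = {"O": 6089, "B-METAPHOR": 325}


def balanced_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


weights = balanced_weights(label_counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
# O: 0.527
# B-METAPHOR: 9.868
```

Under this assumed scheme a B-METAPHOR token counts roughly 19 times as much as an O token in the loss. Weights like these are typically passed, as a tensor ordered by label id, to a weighted cross-entropy loss such as torch.nn.CrossEntropyLoss(weight=...).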

Training procedure

Preprocessing

The dataset was tokenized using the tokenizer from projecte-aina/roberta-base-ca-v2.
Labels were aligned with subword tokens.
When a word was split into multiple subword tokens:
- the first token retained the label
- subsequent tokens were ignored during loss computation
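
The alignment rule above can be sketched in plain Python. The sketch assumes the word-to-subword mapping comes from a fast tokenizer's word_ids() method and that ignored positions are marked with -100, the default ignore_index of PyTorch's cross-entropy loss. The card states neither detail explicitly, and the label ids used here are hypothetical.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # assumed marker for positions excluded from the loss


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Give each word's label to its first subword and mask everything else."""
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)  # special tokens such as <s> and </s>
        elif word_id != previous:
            aligned.append(word_labels[word_id])  # first subword keeps the label
        else:
            aligned.append(IGNORE_INDEX)  # later subwords of the same word
        previous = word_id
    return aligned


# Hypothetical three-word sentence whose second word splits into two subwords.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [0, 1, 0]  # hypothetical ids: 0 = O, 1 = B-METAPHOR
print(align_labels(word_ids, word_labels))
# [-100, 0, 1, -100, 0, -100]
```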

Training setup

Training was performed using the Hugging Face Transformers Trainer API.

Key elements:

RoBERTa base model with a token classification head
Class-weighted loss to mitigate class imbalance
Evaluation using the seqeval metric library
Evaluation and checkpointing performed at the end of each epoch
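
seqeval scores precision, recall and F1 on whole spans rather than on individual tokens, which helps explain why token accuracy can be high while F1 stays much lower. The sketch below is a simplified re-implementation for a single METAPHOR type, written only to illustrate the idea. It is not seqeval's code, and the toy sequences are hypothetical.

```python
from typing import List, Set, Tuple


def extract_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Collect (start, end) spans, end exclusive, from one BIO tag sequence."""
    found, start = set(), None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if tag != "I-METAPHOR" and start is not None:
            found.add((start, i))
            start = None
        if tag == "B-METAPHOR" or (tag == "I-METAPHOR" and start is None):
            start = i
    return found


def span_prf(gold: List[List[str]], pred: List[List[str]]):
    """Span-level precision, recall and F1 with exact-match scoring."""
    gold_spans = {(k, s) for k, tags in enumerate(gold) for s in extract_spans(tags)}
    pred_spans = {(k, s) for k, tags in enumerate(pred) for s in extract_spans(tags)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def token_accuracy(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    pairs = [(g, p) for gs, ps in zip(gold, pred) for g, p in zip(gs, ps)]
    return sum(g == p for g, p in pairs) / len(pairs)


# Toy data: the first prediction cuts a two-token span short.
gold = [["O", "B-METAPHOR", "I-METAPHOR", "O"], ["O", "O", "B-METAPHOR", "O"]]
pred = [["O", "B-METAPHOR", "O", "O"], ["O", "O", "B-METAPHOR", "O"]]
print(span_prf(gold, pred))        # (0.5, 0.5, 0.5)
print(token_accuracy(gold, pred))  # 0.875
```

A single wrong token leaves accuracy at 0.875 but costs a full span, so span-level F1 drops to 0.5.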

Training hyperparameters

Learning rate: 2e-5
Train batch size: 4
Evaluation batch size: 4
Gradient accumulation steps: 2
Effective batch size: 8
Epochs: 10
Weight decay: 0.1
LR scheduler: linear
Warmup steps: 50
Logging steps: 20
Optimizer: AdamW
Mixed precision: AMP
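
For reference, the values above map onto Transformers TrainingArguments parameter names roughly as follows. This is a sketch rather than the original training script: fp16 is an assumption (the card only says "AMP", which could also mean bf16), the two strategy entries reflect the per-epoch evaluation and checkpointing described under Training setup, and the effective batch size assumes a single device.

```python
# Keyword arguments named after transformers.TrainingArguments parameters.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 10,
    "weight_decay": 0.1,
    "lr_scheduler_type": "linear",
    "warmup_steps": 50,
    "logging_steps": 20,
    "eval_strategy": "epoch",  # evaluate at the end of each epoch
    "save_strategy": "epoch",  # checkpoint at the end of each epoch
    "fp16": True,              # assumption: the card only states "AMP"
}

# Effective batch size = per-device batch size x accumulation steps (one device assumed).
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```

These could be passed as TrainingArguments(output_dir="out", **training_kwargs), where "out" is a placeholder path. AdamW is the Trainer's default optimizer family, so no optimizer entry is listed.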

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
—	1	66	0.7492	0.2000	0.0476	0.0769	0.9426
0.8617	2	132	0.5231	0.2238	0.7619	0.3459	0.8553
0.8617	3	198	0.5649	0.3571	0.5952	0.4464	0.9258
0.4759	4	264	0.6743	0.6286	0.5238	0.5714	0.9605
0.3202	5	330	0.6171	0.6970	0.5476	0.6133	0.9653
0.3202	6	396	0.6861	0.6875	0.5238	0.5946	0.9641
0.2365	7	462	0.6396	0.6857	0.5714	0.6234	0.9653
0.1962	8	528	0.6864	0.7273	0.5714	0.6400	0.9677
0.1962	9	594	0.7467	0.6875	0.5238	0.5946	0.9641
0.1589	10	660	0.7232	0.7188	0.5476	0.6216	0.9665

Framework versions

Transformers: 4.57.3
PyTorch: 2.9.0
Datasets: 4.0.0
Tokenizers: 0.22.1

Model details

Format: Safetensors
Model size: 0.1B params
Tensor type: F32
Repository: mariadelcarmenramirez/metaphor-cat-roberta-base-weights
Base model: projecte-aina/roberta-base-ca-v2