Binary MWE Detection with DeBERTa

DeBERTa-v3-large fine-tuned for multiword expression (MWE) identification. Detects both continuous MWEs (kick the bucket) and discontinuous MWEs (look [the information] up).

📄 Paper: "Binary Token-Level Classification with DeBERTa for All-Type MWE Identification" (EACL 2026 Findings)
💻 Code: github.com/DiegoRossini/binary-mwe-detection

Approach

Instead of traditional BIO sequence labeling, we predict three independent binary labels per token: START, END, and INSIDE. This captures whether each token begins, ends, or is inside an MWE.

Example: in "looked the information up", {looked, up} is a discontinuous MWE. Its START and END labels are:

Token	START	END
looked	✓	-
the	-	-
information	-	-
up	-	✓

This formulation naturally handles discontinuous patterns and provides richer training signals than span-level labeling.
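The START and END columns of the example can be reproduced with a few lines of Python. This is a minimal sketch, not the repository's preprocessing code: the helper name `start_end_labels` is hypothetical, it assumes START marks the first component of the MWE and END the last, and it leaves out INSIDE because the example table does not show that column.

```python
from typing import Dict, List, Sequence


def start_end_labels(tokens: Sequence[str], mwe_indices: Sequence[int]) -> Dict[str, List[int]]:
    """Build binary START/END label vectors for a single MWE.

    Assumption: START is set on the first MWE component and END on the last.
    """
    start = [0] * len(tokens)
    end = [0] * len(tokens)
    if mwe_indices:
        start[min(mwe_indices)] = 1
        end[max(mwe_indices)] = 1
    return {"START": start, "END": end}


tokens = ["looked", "the", "information", "up"]
# {looked, up} is the discontinuous MWE: token positions 0 and 3.
labels = start_end_labels(tokens, [0, 3])

for token, s, e in zip(tokens, labels["START"], labels["END"]):
    print(f"{token}\t{s}\t{e}")
```

Because the two vectors are built independently, the gap between "looked" and "up" needs no special tag.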

Results

Evaluated on CoAM, the model outperforms the previous state of the art (Qwen-72B) by 12.0 F1 points (69.8% vs. 57.8%) with roughly 165× fewer parameters (435M vs. 72B).

Model	Parameters	F1	Continuous F1	Discontinuous F1
Ours	435M	69.8%	75.9%	29.7%
Qwen-72B (prev. SOTA)	72B	57.8%	57.3%	17.1%
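The headline numbers follow directly from the table. The snippet below is only an arithmetic check on the reported figures; the variable names are illustrative.

```python
# F1 scores (%) and parameter counts copied from the results table.
ours = {"f1": 69.8, "continuous": 75.9, "discontinuous": 29.7, "params": 435e6}
prev = {"f1": 57.8, "continuous": 57.3, "discontinuous": 17.1, "params": 72e9}

# Absolute F1 gains in percentage points, rounded to one decimal place.
f1_gain = round(ours["f1"] - prev["f1"], 1)
continuous_gain = round(ours["continuous"] - prev["continuous"], 1)
discontinuous_gain = round(ours["discontinuous"] - prev["discontinuous"], 1)

# How many times smaller the fine-tuned model is.
param_ratio = prev["params"] / ours["params"]

print(f"Overall gain: +{f1_gain} F1")
print(f"Continuous gain: +{continuous_gain} F1")
print(f"Discontinuous gain: +{discontinuous_gain} F1")
print(f"Parameter ratio: {param_ratio:.1f}x")
```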

Installation

pip install transformers torch spacy networkx
python -m spacy download en_core_web_lg

Usage

from transformers import AutoModel

model = AutoModel.from_pretrained("DiegoRossini/mwe-detection-deberta", trust_remote_code=True)

# Continuous MWE
mwes = model.detect("They made up their minds.")
print(mwes)  # ['made up']

# Verb + preposition MWE (contiguous in this sentence)
mwes = model.detect("I ran into an old friend yesterday.")
print(mwes)  # ['ran into']

# Detailed output with scores
mwes = model.detect("He kicked the bucket.", return_details=True)
print(mwes)

Training

Base model: DeBERTa-v3-large
Dataset: CoAM (780 train / 521 test sentences)
Features: NP chunking + dependency distances (via spaCy)
Augmentation: 30% oversampling
Thresholds: τ_start=0.5, τ_end=0.6, τ_inside=0.2
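The three thresholds are applied per label, so a token can pass one of them and fail the others. The sketch below shows only that binarization step. It assumes the model yields one probability per label per token and that a label fires when its probability is at or above its threshold; the function name `binarize` and the probabilities are made up for illustration, and the step that assembles decisions into MWE spans is not shown.

```python
from typing import Dict, List

# Thresholds copied from the Training list above.
THRESHOLDS: Dict[str, float] = {"start": 0.5, "end": 0.6, "inside": 0.2}


def binarize(
    token_probs: List[Dict[str, float]],
    thresholds: Dict[str, float] = THRESHOLDS,
) -> List[Dict[str, bool]]:
    """Compare each label probability with its own threshold.

    Assumption: a label is active when probability >= threshold.
    """
    return [
        {label: probs[label] >= tau for label, tau in thresholds.items()}
        for probs in token_probs
    ]


# Made-up probabilities for "looked the information up"; NOT model output.
example_probs = [
    {"start": 0.93, "end": 0.04, "inside": 0.10},
    {"start": 0.02, "end": 0.03, "inside": 0.15},
    {"start": 0.01, "end": 0.55, "inside": 0.25},
    {"start": 0.03, "end": 0.88, "inside": 0.05},
]

decisions = binarize(example_probs)
for token, decision in zip(["looked", "the", "information", "up"], decisions):
    print(token, decision)
```

In this toy input, the third token's END probability of 0.55 stays below τ_end=0.6, while its INSIDE probability of 0.25 clears the lower τ_inside=0.2.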

Citation

@inproceedings{rossini2026binary,
    title = "Binary Token-Level Classification with {DeBERTa} for All-Type {MWE} Identification",
    author = "Rossini, Diego and van der Plas, Lonneke",
    booktitle = "Findings of EACL 2026",
    year = "2026"
}
