Binary MWE Detection with DeBERTa

DeBERTa-v3-large fine-tuned for multiword expression (MWE) identification. Detects both continuous MWEs (kick the bucket) and discontinuous MWEs (look [the information] up).

📄 Paper: "Binary Token-Level Classification with DeBERTa for All-Type MWE Identification" (EACL 2026 Findings)
💻 Code: github.com/DiegoRossini/binary-mwe-detection

Approach

Instead of traditional BIO sequence labeling, we predict three independent binary labels per token: START, END, and INSIDE. This captures whether each token begins, ends, or is inside an MWE.

Example: "looked the information up" where {looked, up} is a discontinuous MWE:

Token        START  END  INSIDE
looked       ✓      -    -
the          -      -    -
information  -      -    -
up           -      ✓    -

This formulation naturally handles discontinuous patterns and provides richer training signals than span-level labeling.
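
The labeling scheme can be illustrated with a minimal sketch. The helper `encode_binary_labels` is hypothetical and is not taken from the released code. It assumes that INSIDE marks member tokens strictly between the first and last member of an MWE, and that gap tokens outside the MWE (here "the" and "information") receive no label.

```python
from typing import Dict, List, Sequence


def encode_binary_labels(
    tokens: Sequence[str], mwes: Sequence[Sequence[int]]
) -> List[Dict[str, int]]:
    """Return one dict of 0/1 flags (START, END, INSIDE) per token.

    Assumption (not confirmed by the source): INSIDE marks member tokens
    strictly between the first and last member; gap tokens stay all-zero.
    """
    labels = [{"START": 0, "END": 0, "INSIDE": 0} for _ in tokens]
    for members in mwes:
        if not members:
            continue  # skip empty MWE entries
        ordered = sorted(members)
        labels[ordered[0]]["START"] = 1   # first member token
        labels[ordered[-1]]["END"] = 1    # last member token
        for idx in ordered[1:-1]:         # members in between, if any
            labels[idx]["INSIDE"] = 1
    return labels


tokens = ["looked", "the", "information", "up"]
labels = encode_binary_labels(tokens, mwes=[[0, 3]])  # {looked, up}
for token, flags in zip(tokens, labels):
    print(f"{token:<12} {flags}")
```

Because the three flags are independent, a token between "looked" and "up" needs no special gap tag: it simply carries no positive label.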

Results

Evaluated on CoAM. We outperform the previous state-of-the-art (Qwen-72B) by +12 F1 points with 165× fewer parameters.

Model                  Parameters  F1     Continuous F1  Discontinuous F1
Ours                   435M        69.8%  75.9%          29.7%
Qwen-72B (prev. SOTA)  72B         57.8%  57.3%          17.1%
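
The headline figures follow directly from the table. The short check below recomputes them. The dictionary layout is only an illustration, and the numbers are copied from the rows above.

```python
# Figures copied from the results table; the dict layout is illustrative.
ours = {"params": 435e6, "f1": 69.8, "cont": 75.9, "disc": 29.7}
prev = {"params": 72e9, "f1": 57.8, "cont": 57.3, "disc": 17.1}

f1_gain = ours["f1"] - prev["f1"]            # overall F1 difference
cont_gain = ours["cont"] - prev["cont"]      # continuous MWEs
disc_gain = ours["disc"] - prev["disc"]      # discontinuous MWEs
param_ratio = prev["params"] / ours["params"]

print(f"Overall F1 gain:       {f1_gain:+.1f} points")
print(f"Continuous F1 gain:    {cont_gain:+.1f} points")
print(f"Discontinuous F1 gain: {disc_gain:+.1f} points")
print(f"Parameter ratio:       {param_ratio:.1f}x")
```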

Installation

pip install transformers torch spacy networkx
python -m spacy download en_core_web_lg

Usage

from transformers import AutoModel

model = AutoModel.from_pretrained("DiegoRossini/mwe-detection-deberta", trust_remote_code=True)

# Continuous MWE
mwes = model.detect("They made up their minds.")
print(mwes)  # ['made up']

# Another continuous MWE
mwes = model.detect("I ran into an old friend yesterday.")
print(mwes)  # ['ran into']

# Detailed output with scores
mwes = model.detect("He kicked the bucket.", return_details=True)
print(mwes)
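
To process several sentences, the calls above can be wrapped in a small loop. The sketch below is illustrative only. `detect_all` is a hypothetical helper and `fake_detect` is a stand-in so the example runs without downloading the model. It assumes that `model.detect` returns a list of strings, as the printed outputs above suggest.

```python
from typing import Callable, Dict, Iterable, List


def detect_all(
    detect: Callable[[str], List[str]], sentences: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each sentence to the MWEs returned by `detect`."""
    return {sentence: detect(sentence) for sentence in sentences}


def fake_detect(sentence: str) -> List[str]:
    """Stand-in for model.detect: simple substring lookup, not a real model."""
    known = ["made up", "ran into"]
    return [mwe for mwe in known if mwe in sentence]


sentences = [
    "They made up their minds.",
    "I ran into an old friend yesterday.",
]
results = detect_all(fake_detect, sentences)
for sentence, found in results.items():
    print(sentence, "->", found)
```

With the real model loaded, `detect_all(model.detect, sentences)` would be the corresponding call.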

Training

  • Base model: DeBERTa-v3-large
  • Dataset: CoAM (780 train / 521 test)
  • Features: NP chunking + dependency distances (via spaCy)
  • Augmentation: 30% oversampling
  • Thresholds: τ_start=0.5, τ_end=0.6, τ_inside=0.2
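
The per-label thresholds can be read as three independent cut-offs applied to each token. The sketch below shows one plausible way to apply them. It assumes each label has its own logit passed through a sigmoid, and that a probability at or above the threshold counts as positive. Neither detail is confirmed by the list above, and `binarize` is a hypothetical helper.

```python
import math
from typing import Dict

# Threshold values taken from the training configuration listed above.
THRESHOLDS = {"START": 0.5, "END": 0.6, "INSIDE": 0.2}


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(
    logits: Dict[str, float], thresholds: Dict[str, float] = THRESHOLDS
) -> Dict[str, bool]:
    """Turn one token's three logits into three independent decisions.

    Assumption: a label is positive when sigmoid(logit) >= its threshold.
    """
    return {
        label: sigmoid(logits[label]) >= thresholds[label]
        for label in thresholds
    }


# Hypothetical logits for a single token, invented for illustration.
example = binarize({"START": 0.3, "END": 0.3, "INSIDE": -1.0})
print(example)
```

Under these assumptions, the same logit of 0.3 (probability about 0.57) clears the START threshold but not the stricter END threshold, while the low INSIDE threshold accepts a probability of about 0.27.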

Citation

@inproceedings{rossini2026binary,
    title = "Binary Token-Level Classification with {DeBERTa} for All-Type {MWE} Identification",
    author = "Rossini, Diego and van der Plas, Lonneke",
    booktitle = "Findings of EACL 2026",
    year = "2026"
}