Model Background

This model, XL-DURel, is trained on ordinal and binary Word-in-Context (WiC) data and optimized with AnglE loss (Li & Li, 2023). The binary data is converted into the ordinal WiC format; see the paper for details.

For more details, see our paper: XL-DURel: Finetuning Sentence Transformers for Ordinal Word-in-Context Classification

Reproducing Results

To reproduce the results presented in the XL-DURel paper, please follow the instructions in our GitHub repository: XL-DURel Reproduction Instructions

Usage

pip install -U sentence-transformers

✅ Recommended Method — Target Word Encoding

For the ordinal WiC task, the model expects the target word to be marked in each sentence using <t> and </t> tags. This lets the model focus on the contextual embedding of the specific word rather than the full sentence.

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("sachinn1/xl-durel")

# Two uses of the same target word "bank" in different contexts
sentence1 = "She sat on the <t>bank </t> of the river to rest."
sentence2 = "He went to the <t>bank </t> to deposit his savings."

embeddings = model.encode([sentence1, sentence2])

similarity = cosine_similarity([embeddings[0]], [embeddings[1]])[0][0]
print(f"Similarity: {similarity:.4f}")
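The score printed above is the cosine of the angle between the two embedding vectors. As a minimal, dependency-free sketch of that computation (the helper name `cosine` and the toy vectors are illustrative assumptions, not part of the model or of scikit-learn):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors standing in for real embeddings.
same_direction = cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors pointing in the same direction score 1 and orthogonal vectors score 0; for this model, a higher score is expected to indicate that the two uses of the target word are closer in meaning.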

To make it easy to format your own data for this model, we provide a Python library that transforms raw sentences into the <t>...</t> tagged format expected by XL-DURel.

pip install xl-durel-utils

from xl_durel_utils import tokenize_truncate_decode

sentence = "She sat on the bank of the river to rest."
positions = [15, 19]  # character start and end offsets of the target word (sentence[15:19] is "bank")

result = tokenize_truncate_decode(sentence, positions, tokenizer=model, max_seq_len=128)
# result → "She sat on the <t>bank </t> of the river to rest."

You can then pass the returned string directly to the SentenceTransformer model as shown in the Recommended Method above.
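To make the tagged format concrete, here is a minimal sketch of inserting the markers from character offsets. It is not the implementation of xl-durel-utils: the function name `mark_target` is hypothetical, the end offset is assumed to be exclusive, and no tokenization or truncation is performed. It also does not reproduce the space before </t> that appears in the library's output above.

```python
def mark_target(sentence: str, start: int, end: int) -> str:
    """Wrap sentence[start:end] in <t>...</t> markers (end is exclusive)."""
    if not 0 <= start < end <= len(sentence):
        raise ValueError("offsets must satisfy 0 <= start < end <= len(sentence)")
    return f"{sentence[:start]}<t>{sentence[start:end]}</t>{sentence[end:]}"

sentence = "He went to the bank to deposit his savings."
start = sentence.index("bank")  # offset of the first occurrence of the target word
tagged = mark_target(sentence, start, start + len("bank"))
print(tagged)
```

Locating the target with `str.index` only works when the word occurs once in the sentence; with repeated occurrences, the offsets should come from the dataset annotation instead.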

For a complete working example, including batch encoding, ordinal label mapping, and evaluation on WiC datasets, see xl-durel.ipynb.
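As a rough illustration of what ordinal label mapping can look like, the sketch below bins a similarity score into ordinal labels using fixed cut points. Everything specific here is an assumption for illustration: the four-point label range, the threshold values, and the function name `to_ordinal_label` are hypothetical and are not taken from the notebook or the paper. In practice the cut points would be tuned on labeled development data.

```python
from bisect import bisect_right

# Hypothetical cut points between adjacent labels; placeholders, not tuned values.
THRESHOLDS = [0.25, 0.50, 0.75]

def to_ordinal_label(similarity: float, thresholds=THRESHOLDS) -> int:
    """Map a similarity score to a label in 1..len(thresholds) + 1."""
    return bisect_right(thresholds, similarity) + 1

labels = [to_ordinal_label(s) for s in (0.10, 0.40, 0.60, 0.90)]
print(labels)
```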

If you just need general-purpose sentence embeddings (without target word marking), you can use the model directly with sentence-transformers:

pip install -U sentence-transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sachinn1/xl-durel")

sentences = ["This is an example sentence", "Each sentence is converted"]
embeddings = model.encode(sentences)
print(embeddings)

Note: This method does not use target word marking and is not optimized for the ordinal WiC task. For best results on WiC-style tasks, use the Recommended Method above.

Training

The model was trained with the parameters:

DataLoader:

torch.utils.data.dataloader.DataLoader of length 9369 with parameters:

{'batch_size': 32, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}

Loss:

sentence_transformers.losses.AnglELoss.AnglELoss with parameters:

{'scale': 20.0, 'similarity_fct': 'pairwise_angle_sim'}
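In sentence-transformers, AnglELoss applies a CoSENT-style ranking objective to the similarity scores of a batch: whenever one pair has a lower gold label than another, it is penalized for receiving a higher similarity. The following is a simplified pure-Python sketch of that objective only. It takes precomputed similarity scores as input and does not reproduce pairwise_angle_sim itself; the function name `ranking_loss` and the toy scores are assumptions for illustration.

```python
import math

def ranking_loss(similarities, labels, scale=20.0):
    """log(1 + sum of exp(scale * (s_i - s_j)) over all i, j with label_i < label_j)."""
    terms = [0.0]  # the leading 0 contributes exp(0) = 1 inside the log
    for s_i, y_i in zip(similarities, labels):
        for s_j, y_j in zip(similarities, labels):
            if y_i < y_j:
                terms.append(scale * (s_i - s_j))
    # Numerically stable log-sum-exp over all collected terms.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

gold = [1, 2, 3, 4]
well_ordered = ranking_loss([0.1, 0.4, 0.6, 0.9], gold)
reversed_order = ranking_loss([0.9, 0.6, 0.4, 0.1], gold)
```

Scores that rank the pairs in the same order as the gold labels give a loss close to zero, while scores in the reverse order give a large loss. Only the ordering of the labels matters, which is what makes the objective suitable for ordinal data.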

Parameters of the fit()-Method:

{
    "epochs": 10,
    "evaluation_steps": 2342,
    "evaluator": "WordTransformer.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'transformers.optimization.AdamW'>",
    "optimizer_params": {
        "lr": 1e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 9369,
    "weight_decay": 0.0
}
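With steps_per_epoch left as null, the number of steps per epoch presumably equals the DataLoader length (9369), which gives 9369 × 10 = 93690 training steps, so the warmup spans the first epoch. The sketch below illustrates linear warmup followed by linear decay under those assumptions. The function name `warmup_linear_lr` is hypothetical, and this is not the library's scheduler code.

```python
STEPS_PER_EPOCH = 9369  # DataLoader length reported above
EPOCHS = 10
WARMUP_STEPS = 9369
PEAK_LR = 1e-05
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS  # assumes one optimizer step per batch

def warmup_linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear ramp up, then linear decay to 0."""
    if step < WARMUP_STEPS:
        factor = step / max(1, WARMUP_STEPS)
    else:
        factor = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return PEAK_LR * factor

for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS):
    print(step, warmup_linear_lr(step))
```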

Citing & Authors

@inproceedings{yadav-schlechtweg-2025-xl,
    title = "{XL}-{DUR}el: Finetuning Sentence Transformers for Ordinal Word-in-Context Classification",
    author = "Yadav, Sachin  and
      Schlechtweg, Dominik",
    editor = "Inui, Kentaro  and
      Sakti, Sakriani  and
      Wang, Haofen  and
      Wong, Derek F.  and
      Bhattacharyya, Pushpak  and
      Banerjee, Biplab  and
      Ekbal, Asif  and
      Chakraborty, Tanmoy  and
      Singh, Dhirendra Pratap",
    booktitle = "Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics",
    month = dec,
    year = "2025",
    address = "Mumbai, India",
    publisher = "The Asian Federation of Natural Language Processing and The Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-ijcnlp.19/",
    doi = "10.18653/v1/2025.findings-ijcnlp.19",
    pages = "338--351",
    ISBN = "979-8-89176-303-6",
    abstract = "We propose XL-DURel, a finetuned, multilingual Sentence Transformer model optimized for ordinal Word-in-Context classification. We test several loss functions for regression and ranking tasks managing to outperform previous models on ordinal and binary data with a ranking objective based on angular distance in complex space. We further show that binary WiC can be treated as a special case of ordinal WiC and that optimizing models for the general ordinal task improves performance on the more specific binary task. This paves the way for a unified treatment of WiC modeling across different task formulations."
}

Model size: 0.6B params (Safetensors, tensor type F32)

Paper for sachinn1/xl-durel: XL-DURel: Finetuning Sentence Transformers for Ordinal Word-in-Context Classification (2507.14578, published Jul 19, 2025)