WordNet Word-Sense Embedding Model

This is a sentence-transformers model trained to embed word senses using definitions from WordNet. It maps a word in context (e.g., "bank" in "I sat on the bank") into the same vector space as its definitions, so a context can be matched against candidate senses.

Model Details

  • Base Model: distilbert-base-uncased
  • Training Data: marksverdhei/wordnet-definitions-en-2021
  • Loss Function: InfoNCE, using other definitions of the same word as hard negatives.
  • Pooling: Custom WordPooling. The model pools only the tokens corresponding to the target word, not the entire sentence.
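The WordPooling idea above can be sketched as follows. This is a simplified illustration, not the repository's word_pooling.py: given the per-token hidden states and the indices of the tokens that make up the target word, only those tokens are averaged.

```python
import numpy as np

def word_pool(token_embeddings, target_token_ids):
    """Mean-pool only the hidden states of the tokens belonging to the
    target word, ignoring the rest of the sentence (WordPooling sketch)."""
    return np.mean(token_embeddings[target_token_ids], axis=0)

# Toy example: 5 tokens with 3-dim hidden states; the target word spans tokens 2-3.
hidden = np.arange(15, dtype=float).reshape(5, 3)
pooled = word_pool(hidden, [2, 3])  # average of rows 2 and 3 only
```

The real layer additionally has to map the quoted target word back to its subword token span after tokenization, which is what the custom tokenization logic in word_pooling.py handles.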

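The training objective can be sketched in a few lines. This is a generic InfoNCE over a batch, where row i of the word embeddings should match row i of the definition embeddings and the remaining rows (such as other definitions of the same word) act as negatives; it is an illustration, not the actual training code.

```python
import numpy as np

def info_nce(word_embs, def_embs, temperature=0.05):
    """InfoNCE: cross-entropy that pulls each word embedding toward its own
    definition and pushes it away from the other definitions in the batch."""
    w = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    d = def_embs / np.linalg.norm(def_embs, axis=1, keepdims=True)
    logits = (w @ d.T) / temperature                 # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # loss on the matched pairs

# Matched pairs score a much lower loss than mismatched ones:
w = np.array([[1.0, 0.0], [0.0, 1.0]])
loss_aligned = info_nce(w, w)
loss_swapped = info_nce(w, w[::-1])
```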
Usage

Because this model uses a custom pooling layer and tokenization logic (to identify the target word), you must use the provided word_pooling.py code to load it.

Installation

pip install sentence-transformers

Loading the Model

  1. Download word_pooling.py from this repository.
  2. Load the model using the WordSenseTransformer class.
from word_pooling import WordSenseTransformer

# Load from Hugging Face Hub
model = WordSenseTransformer("marksverdhei/wordnet-sense-embedding")

# Define inputs with the format: "'<word>': <context>"
sentences = [
    "'bank': I sat on the river bank.",
    "'bank': I deposited money at the bank."
]

embeddings = model.encode(sentences)

# Compare with definitions
definitions = [
    "'bank': A sloping land (especially the slope beside a body of water).",
    "'bank': A financial institution that accepts deposits."
]
def_embeddings = model.encode(definitions)

# Compute similarity...
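The elided similarity step could look like the sketch below, which picks the definition with the highest cosine similarity. The arrays here are hypothetical stand-ins for the `model.encode(...)` outputs above:

```python
import numpy as np

def best_definition(context_emb, definition_embs):
    """Index of the definition whose embedding is most cosine-similar
    to the context embedding."""
    c = context_emb / np.linalg.norm(context_emb)
    d = definition_embs / np.linalg.norm(definition_embs, axis=1, keepdims=True)
    return int(np.argmax(d @ c))

# Hypothetical stand-ins for embeddings[0] and def_embeddings:
river_bank = np.array([0.9, 0.1])
defs = np.array([[0.8, 0.2],    # sloping land beside water
                 [0.1, 0.9]])   # financial institution
idx = best_definition(river_bank, defs)  # -> 0, the "sloping land" sense
```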

Performance

  • WSD Accuracy (WordNet test set): 52.0% (vs. baseline: 39.6%)
  • Training Epochs: 1
  • Model size: 66.4M params (Safetensors, F32)