WordNet Word-Sense Embedding Model

This is a sentence-transformers model trained to embed word senses using definitions from WordNet. It maps a word in context (e.g., "bank" in "I sat on the bank") into the same vector space as its definitions, so a context can be matched against candidate senses.

Model Details

  • Base Model: distilbert-base-uncased
  • Training Data: marksverdhei/wordnet-definitions-en-2021
  • Loss Function: InfoNCE, using other definitions of the same word as hard negatives.
  • Pooling: Custom WordPooling. The model pools only the tokens corresponding to the target word, not the entire sentence.
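The WordPooling idea above can be sketched as follows. This is a simplified illustration, not the repository's word_pooling.py: given the per-token hidden states and the indices of the tokens that make up the target word, only those tokens are averaged.

```python
import numpy as np

def word_pool(token_embeddings, target_token_ids):
    """Mean-pool only the hidden states of the tokens belonging to the
    target word, ignoring the rest of the sentence (WordPooling sketch)."""
    return np.mean(token_embeddings[target_token_ids], axis=0)

# Toy example: 5 tokens with 3-dim hidden states; the target word spans tokens 2-3.
hidden = np.arange(15, dtype=float).reshape(5, 3)
pooled = word_pool(hidden, [2, 3])  # average of rows 2 and 3 only
```

The real layer additionally has to map the quoted target word back to its subword token span after tokenization, which is what the custom tokenization logic in word_pooling.py handles.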

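The training objective can be sketched in a few lines. This is a generic InfoNCE over a batch, where row i of the word embeddings should match row i of the definition embeddings and the remaining rows (such as other definitions of the same word) act as negatives; it is an illustration, not the actual training code.

```python
import numpy as np

def info_nce(word_embs, def_embs, temperature=0.05):
    """InfoNCE: cross-entropy that pulls each word embedding toward its own
    definition and pushes it away from the other definitions in the batch."""
    w = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    d = def_embs / np.linalg.norm(def_embs, axis=1, keepdims=True)
    logits = (w @ d.T) / temperature                 # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # loss on the matched pairs

# Matched pairs score a much lower loss than mismatched ones:
w = np.array([[1.0, 0.0], [0.0, 1.0]])
loss_aligned = info_nce(w, w)
loss_swapped = info_nce(w, w[::-1])
```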
Usage

Because this model uses a custom pooling layer and tokenization logic (to identify the target word), you must use the provided word_pooling.py code to load it.

Installation

pip install sentence-transformers

Loading the Model

  1. Download word_pooling.py from this repository.
  2. Load the model using the WordSenseTransformer class.
from word_pooling import WordSenseTransformer

# Load from Hugging Face Hub
model = WordSenseTransformer("marksverdhei/wordnet-sense-embedding")

# Define inputs with the format: "'<word>': <context>"
sentences = [
    "'bank': I sat on the river bank.",
    "'bank': I deposited money at the bank."
]

embeddings = model.encode(sentences)

# Compare with definitions
definitions = [
    "'bank': A sloping land (especially the slope beside a body of water).",
    "'bank': A financial institution that accepts deposits."
]
def_embeddings = model.encode(definitions)

# Compute similarity...
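The elided similarity step could look like the sketch below, which picks the definition with the highest cosine similarity. The arrays here are hypothetical stand-ins for the `model.encode(...)` outputs above:

```python
import numpy as np

def best_definition(context_emb, definition_embs):
    """Index of the definition whose embedding is most cosine-similar
    to the context embedding."""
    c = context_emb / np.linalg.norm(context_emb)
    d = definition_embs / np.linalg.norm(definition_embs, axis=1, keepdims=True)
    return int(np.argmax(d @ c))

# Hypothetical stand-ins for embeddings[0] and def_embeddings:
river_bank = np.array([0.9, 0.1])
defs = np.array([[0.8, 0.2],    # sloping land beside water
                 [0.1, 0.9]])   # financial institution
idx = best_definition(river_bank, defs)  # -> 0, the "sloping land" sense
```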

Performance

  • WSD Accuracy (WordNet test set): 52.0% (vs. baseline: 39.6%)
  • Training Epochs: 1
  • Model size: 66.4M params (Safetensors, F32)