# PhraseBERT ONNX
ONNX export of whaleloops/phrase-bert for lightweight inference using ONNX Runtime — no PyTorch or Transformers required.
## Model Details
- Base model: whaleloops/phrase-bert (BERT-base, 12 layers, 768 hidden dim)
- Pooling: Mean pooling (attention-mask weighted)
- Format: ONNX (the graph's expected inputs and outputs can be checked with the sketch below)
- Size: ~416 MB
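To double-check the exported graph before wiring up inference, ONNX Runtime can list the model's input and output signatures. A minimal sketch, assuming `model.onnx` has already been downloaded locally (e.g. via `snapshot_download` as in the Usage section below):

```python
import onnxruntime as ort

# Assumes model.onnx is available locally (see the Usage section for downloading)
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Inputs: input_ids, attention_mask, token_type_ids (int64, [batch, seq_len])
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)

# Output: token embeddings of shape [batch, seq_len, 768]
for out in session.get_outputs():
    print(out.name, out.shape, out.type)
```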
## Usage
Install the dependencies (no PyTorch or Transformers needed; `huggingface_hub` is only used to download the model files):

```bash
pip install onnxruntime tokenizers numpy huggingface_hub
```
Download the model and run inference:

```python
from huggingface_hub import snapshot_download
# Download the model
model_dir = snapshot_download("langminer/phrase-bert-onnx")
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer
# Load model and tokenizer
session = ort.InferenceSession(f"{model_dir}/model.onnx", providers=["CPUExecutionProvider"])
tokenizer = Tokenizer.from_file(f"{model_dir}/tokenizer.json")
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]")
tokenizer.enable_truncation(max_length=512)
# Encode phrases
phrases = ["play an active role", "participate actively", "machine learning"]
encodings = tokenizer.encode_batch(phrases)
input_ids = np.array([e.ids for e in encodings], dtype=np.int64)
attention_mask = np.array([e.attention_mask for e in encodings], dtype=np.int64)
token_type_ids = np.array([e.type_ids for e in encodings], dtype=np.int64)
# Run inference
outputs = session.run(None, {
"input_ids": input_ids,
"attention_mask": attention_mask,
"token_type_ids": token_type_ids,
})
token_embeddings = outputs[0] # (batch, seq_len, 768)
# Mean pooling
mask = attention_mask[:, :, np.newaxis].astype(np.float32)
embeddings = np.sum(token_embeddings * mask, axis=1) / np.sum(mask, axis=1)
print(embeddings.shape)  # (3, 768)
```
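Phrase embeddings are typically compared with cosine similarity. A minimal sketch continuing from the `embeddings` array computed above:

```python
# Cosine similarity between the pooled phrase embeddings from the example above
norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
normalized = embeddings / norms
similarity = normalized @ normalized.T

# The paraphrase pair should score higher than the unrelated pair
print(similarity[0, 1])  # "play an active role" vs. "participate actively"
print(similarity[0, 2])  # "play an active role" vs. "machine learning"
```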
## Citation
```bibtex
@inproceedings{wang2021phrase,
  title={Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration},
  author={Wang, Shufan and Thompson, Laure and Iyyer, Mohit},
  booktitle={EMNLP},
  year={2021}
}
```