# CT2 INT8 version of intfloat/multilingual-e5-large

This repository contains a CTranslate2 INT8 quantized version of [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) for faster inference.
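For reference, a conversion like this one can be reproduced with CTranslate2's Transformers converter (the `ct2-transformers-converter` CLI wraps the same API). A minimal sketch; the output directory name is illustrative:

```python
import ctranslate2

# Convert the original model to CTranslate2 format with INT8 weight quantization
converter = ctranslate2.converters.TransformersConverter("intfloat/multilingual-e5-large")
converter.convert("multilingual-e5-large-ct2-int8", quantization="int8")
```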
## Usage
```python
import ctranslate2
import numpy as np
from huggingface_hub import snapshot_download
from transformers import AutoTokenizer

model_id = "YOUR_USERNAME/multilingual-e5-large-ct2-int8"
device = "cpu"  # or "cuda"

# Download the repo locally; CTranslate2 loads models from a directory path
model_path = snapshot_download(model_id)

# Load tokenizer (now from the same repo!)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Load the CT2 model; E5 is encoder-only, so it is loaded with
# ctranslate2.Encoder (Translator is for sequence-to-sequence models)
encoder = ctranslate2.Encoder(model_path, device=device)

texts = [
    "query: how tall is the Eiffel Tower?",
    "passage: The Eiffel Tower is 330 metres tall.",
]

# Tokenize
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="np")

# CT2 expects token strings, not stringified token ids
tokens = [tokenizer.convert_ids_to_tokens(ids) for ids in inputs["input_ids"].tolist()]

# Run inference; last_hidden_state has shape (batch, seq_len, hidden)
output = encoder.forward_batch(tokens)
hidden_states = np.array(output.last_hidden_state)

# Mean pooling over non-padding tokens only
mask = inputs["attention_mask"][..., np.newaxis].astype(hidden_states.dtype)
embeddings = (hidden_states * mask).sum(axis=1) / mask.sum(axis=1)

# Normalize to unit length
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Cosine similarity between the query and the passage
sim = embeddings[0] @ embeddings[1]
print("Cosine similarity:", round(float(sim), 4))
```
## Files

- `model.bin`: CT2 quantized model weights
- `ct2_config.json`: CT2 model configuration
- `config.json`, `tokenizer*.json`, `vocab.*`: original tokenizer files
- All other files needed for tokenization
## Original Model

Based on: [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large)
## Requirements

```bash
pip install ctranslate2 transformers torch numpy
```
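To check that INT8 quantization did not noticeably degrade the embeddings, they can be compared against the original FP32 model. A minimal sketch, assuming `sentence-transformers` is additionally installed and reusing `texts` and `embeddings` from the Usage example above:

```python
from sentence_transformers import SentenceTransformer

# Encode the same texts with the original FP32 model, normalized like above
reference = SentenceTransformer("intfloat/multilingual-e5-large")
ref_embeddings = reference.encode(texts, normalize_embeddings=True)

# Cosine similarity between INT8 and FP32 embeddings should be close to 1.0
for int8_vec, ref_vec in zip(embeddings, ref_embeddings):
    print("Agreement with FP32:", round(float(int8_vec @ ref_vec), 4))
```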