# jina-embeddings-v3

Multi-format version of [jinaai/jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3), optimized for deployment.
## Model Information
| Property | Value |
|---|---|
| Base Model | jinaai/jina-embeddings-v3 |
| Task | jina-embedding |
| Type | Text Model |
| Trust Remote Code | True |
## Available Versions
| Folder | Format | Description | Size |
|---|---|---|---|
| `safetensors-fp32/` | PyTorch FP32 | Baseline, highest accuracy | 2200 MB |
| `safetensors-fp16/` | PyTorch FP16 | GPU inference, ~50% smaller | 1108 MB |
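If you only need one of the formats, you can download just that subfolder with `huggingface_hub` (a minimal sketch; assumes each subfolder is self-contained, as implied by the `subfolder=` usage below):

```python
from huggingface_hub import snapshot_download

# Fetch only the FP16 folder instead of the whole repository
# (the pattern is assumed from the folder names in the table above)
local_dir = snapshot_download(
    repo_id="n24q02m/jina-embeddings-v3",
    allow_patterns=["safetensors-fp16/*"],
)
print(local_dir)
```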
## Usage

### PyTorch (GPU)
```python
from transformers import AutoModel, AutoTokenizer
import torch

# GPU inference with FP16
model = AutoModel.from_pretrained(
    "n24q02m/jina-embeddings-v3",
    subfolder="safetensors-fp16",
    torch_dtype=torch.float16,
    trust_remote_code=True
).cuda()

tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/jina-embeddings-v3",
    subfolder="safetensors-fp16",
    trust_remote_code=True
)

# Task options: "retrieval.query", "retrieval.passage", "separation",
# "classification", "text-matching"
task = "retrieval.query"

# Inference
texts = ["Hello world", "How are you?"]
with torch.no_grad():
    embeddings = model.encode(texts, task=task)
```
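The `encode` method comes from the model's remote code; it is assumed here to return one vector per input as a NumPy array, so the embeddings can be compared directly:

```python
import numpy as np

# Cosine similarity between the two encoded texts
# (assumes `embeddings` is a NumPy array of shape (len(texts), dim))
a, b = embeddings[0], embeddings[1]
score = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {score:.4f}")
```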
## Notes
- SafeTensors FP16 is the primary format for GPU inference.
- The model has 5 LoRA adapters for different tasks (see the retrieval sketch after this list):
  - `retrieval.query`: embeddings for queries in retrieval
  - `retrieval.passage`: embeddings for documents/passages in retrieval
  - `separation`: text separation
  - `classification`: text classification
  - `text-matching`: text matching
- Requires `trust_remote_code=True` to load the model.
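As a sketch of how the two retrieval adapters are meant to be combined, continuing from the usage example above (the query/passage texts are illustrative, and the NumPy return type of `encode` is an assumption):

```python
import numpy as np
import torch

query = "What is the capital of France?"
passages = [
    "Paris is the capital and largest city of France.",
    "Berlin is the capital of Germany.",
]

with torch.no_grad():
    q_emb = model.encode([query], task="retrieval.query")      # query-side adapter
    p_embs = model.encode(passages, task="retrieval.passage")  # document-side adapter

# Rank passages by cosine similarity to the query
q = q_emb[0] / np.linalg.norm(q_emb[0])
p = p_embs / np.linalg.norm(p_embs, axis=1, keepdims=True)
scores = p @ q
print(passages[int(np.argmax(scores))])
```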
## License
Apache 2.0 (following the base model's license)
## Credits
- Base Model: [jinaai/jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3)
- Conversion: PyTorch + SafeTensors