jina-embeddings-v3

Multi-format version of jinaai/jina-embeddings-v3, repackaged for deployment.

Model Information

Property            Value
Base Model          jinaai/jina-embeddings-v3
Task                jina-embedding
Type                Text Model
Trust Remote Code   True

Available Versions

Folder              Format         Description                   Size
safetensors-fp32/   PyTorch FP32   Baseline, highest accuracy    2200 MB
safetensors-fp16/   PyTorch FP16   GPU inference, ~50% smaller   1108 MB
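
The FP32 weights load the same way as the FP16 example below; a minimal sketch for CPU inference or an accuracy baseline (same repository layout assumed):

from transformers import AutoModel

# Full-precision baseline, e.g. for CPU inference or accuracy comparisons
model_fp32 = AutoModel.from_pretrained(
    "n24q02m/jina-embeddings-v3",
    subfolder="safetensors-fp32",
    trust_remote_code=True
)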

Usage

PyTorch (GPU)

from transformers import AutoModel, AutoTokenizer
import torch

# GPU inference with FP16
model = AutoModel.from_pretrained(
    "n24q02m/jina-embeddings-v3",
    subfolder="safetensors-fp16",
    torch_dtype=torch.float16,
    trust_remote_code=True
).cuda()
# The custom encode() method tokenizes internally; the tokenizer is only
# needed if you want to tokenize manually.
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/jina-embeddings-v3",
    subfolder="safetensors-fp16",
    trust_remote_code=True
)

# Task options: "retrieval.query", "retrieval.passage", "separation", "classification", "text-matching"
task = "retrieval.query"

# Inference: encode() selects the LoRA adapter for the given task and
# returns one embedding per input text
texts = ["Hello world", "How are you?"]
with torch.no_grad():
    embeddings = model.encode(texts, task=task)
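
In the upstream implementation, encode() returns a NumPy array, which the quick similarity check below assumes:

import numpy as np

# Cosine similarity between the two example embeddings above
a, b = embeddings[0], embeddings[1]
print(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))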

Notes

  1. SafeTensors FP16 is the primary format for GPU inference; FP32 is the full-precision baseline.
  2. The model ships with 5 task-specific LoRA adapters, selected via the task argument (see the retrieval sketch after this list):
    • retrieval.query: embeds queries for asymmetric retrieval
    • retrieval.passage: embeds passages/documents for asymmetric retrieval
    • separation: embeds texts for clustering and re-ranking
    • classification: embeds texts for classification tasks
    • text-matching: embeds texts for symmetric semantic similarity (e.g., STS)
  3. trust_remote_code=True is required because encode() and the adapter switching live in the repository's custom modeling code.
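
As an illustration of the asymmetric retrieval adapters, a minimal sketch reusing the model loaded in the Usage section (the example corpus is made up):

import numpy as np

query = "How do I reset my password?"
passages = [
    "To reset your password, open Settings and choose Security.",
    "Our office hours are 9am to 5pm on weekdays.",
]

# Encode each side with its matching LoRA adapter
q_emb = model.encode([query], task="retrieval.query")[0]
p_embs = model.encode(passages, task="retrieval.passage")

# Rank passages by cosine similarity against the query
scores = p_embs @ q_emb / (np.linalg.norm(p_embs, axis=1) * np.linalg.norm(q_emb))
print(passages[int(np.argmax(scores))])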

License

CC BY-NC 4.0 (following the base model's license)

Credits

Based on jinaai/jina-embeddings-v3 by Jina AI.