---
license: apache-2.0
pipeline_tag: feature-extraction
tags:
- feature-extraction
- sentence-similarity
- conteb
- contextual-embeddings
language:
- multilingual
---
# pplx-embed-1: Diffusion-LM for Dense and Contextual Retrieval

`pplx-embed-1` and `pplx-embed-1-context` are state-of-the-art text embedding models optimized for real-world, web-scale retrieval tasks.

- Use `pplx-embed-1` for independent text embedding (queries, documents, semantic search)
- Use `pplx-embed-1-context` for document chunks in RAG systems where surrounding context matters
## Models

| Model | Dimensions | Context | MRL | Quantization | Instruction | Pooling |
|---|---|---|---|---|---|---|
| pplx-embed-1-0.6B | 1024 | 32K | Yes | INT8/BINARY | No | Mean |
| pplx-embed-1-4B | 2560 | 32K | Yes | INT8/BINARY | No | Mean |
| pplx-embed-1-context-0.6B | 1024 | 32K | Yes | INT8/BINARY | No | Mean |
| pplx-embed-1-context-4B | 2560 | 32K | Yes | INT8/BINARY | No | Mean |
All models are built on Qwen3 with diffusion-based continued pre-training at Perplexity AI.
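MRL (Matryoshka Representation Learning) support in the table above means an embedding can be truncated to a shorter prefix and re-normalized, trading a little accuracy for smaller storage. A minimal NumPy sketch of this truncation; the 256-dimension target is an illustrative choice, not a documented recommendation:

```python
import numpy as np

def truncate_mrl(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    truncated = embedding[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / norms

# Example: shrink a full 2560-dim vector (pplx-embed-1-4B width) to 256 dims.
# The random vector stands in for a real model output.
full = np.random.default_rng(0).standard_normal(2560)
full /= np.linalg.norm(full)
small = truncate_mrl(full, 256)
```

Truncated vectors remain unit-length, so cosine similarity can be computed on them directly.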
Many modern embedding models rely on instruction tuning, where users prepend an instruction string to the text being embedded. This can yield a 2%-3% lift on benchmarks, but it also introduces prompt-selection overhead and can make indexing pipelines brittle (small instruction changes can shift embedding space). We deliberately avoid this requirement: you can embed the text you want to index directly, without having to choose or maintain an instruction prefix.
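The table above also lists BINARY quantization. As a sketch of how binary codes are typically used, here is sign-based binarization with Hamming-distance scoring; the sign threshold is an assumption for illustration, not the models' documented quantizer:

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    """Sign-based binary quantization: 1 bit per dimension, packed into bytes.
    NOTE: illustrative scheme, not necessarily the official quantizer."""
    bits = (embedding > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary codes."""
    return int(np.unpackbits(a ^ b).sum())

rng = np.random.default_rng(1)
q = binarize(rng.standard_normal(1024))  # 1024 dims -> 128 bytes
d = binarize(rng.standard_normal(1024))
dist = hamming_distance(q, d)
```

Binary codes cut storage 32x relative to float32 and make scoring a cheap XOR-plus-popcount.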
## Usage

### Via API (Contextualized Embeddings)
```bash
curl -X POST https://api.perplexity.ai/v1/contextualizedembeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      [
        "Curiosity begins in childhood with endless questions about the world.",
        "As we grow, curiosity drives us to explore new ideas and challenge assumptions.",
        "Scientific breakthroughs often start with a simple curious question."
      ],
      [
        "The curiosity rover explores Mars, searching for signs of ancient life.",
        "Each discovery on Mars sparks new questions about our place in the universe."
      ]
    ],
    "model": "pplx-embed-1-context-4B"
  }'
```
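The same request can be issued from Python. A minimal sketch using only the standard library; the endpoint and payload mirror the curl example above, and error handling is omitted:

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/v1/contextualizedembeddings"

def build_request(api_key: str, inputs: list, model: str) -> urllib.request.Request:
    """Construct the POST request; each inner list is one document's chunks."""
    payload = json.dumps({"inputs": inputs, "model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request(
    "YOUR_API_KEY",
    [["Curiosity begins in childhood.", "It drives exploration."]],
    "pplx-embed-1-context-4B",
)
# To send: result = json.load(urllib.request.urlopen(req))
```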
### Using Transformers
```python
from transformers import AutoModel

model_ctx = AutoModel.from_pretrained(
    "perplexity-ai/pplx-embed-1-context-4B",
    trust_remote_code=True
)

doc_chunks = [
    [
        "Curiosity begins in childhood with endless questions about the world.",
        "As we grow, curiosity drives us to explore new ideas.",
        "Scientific breakthroughs often start with a curious question."
    ],
    [
        "The curiosity rover explores Mars searching for ancient life.",
        "Each discovery on Mars sparks new questions about the universe."
    ]
]

# encode() returns a list of numpy arrays, one per document:
# embeddings[0].shape == (3, 2560), embeddings[1].shape == (2, 2560)
embeddings = model_ctx.encode(doc_chunks)
```
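Once chunk embeddings are available, retrieval reduces to similarity search. A minimal NumPy sketch scoring a query vector against one document's chunk matrix; the random vectors stand in for real model outputs:

```python
import numpy as np

def cosine_scores(query: np.ndarray, chunks: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of `chunks`."""
    q = query / np.linalg.norm(query)
    c = chunks / np.linalg.norm(chunks, axis=1, keepdims=True)
    return c @ q

rng = np.random.default_rng(0)
query_emb = rng.standard_normal(2560)        # stand-in for a query embedding
chunk_embs = rng.standard_normal((3, 2560))  # stand-in for embeddings[0]
scores = cosine_scores(query_emb, chunk_embs)
best_chunk = int(np.argmax(scores))          # index of the most similar chunk
```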
## Technical Details
For comprehensive technical details and evaluation results, see our paper on arXiv.
## Contact

- Website: https://perplexity.ai
- API Support: api-support@perplexity.ai
