Commit eec040e · Parent(s): 206ab7e

Usage section for TEI (#4)

- docs: added usage section for TEI (48b327cf88cf893c73dcebd372043472c3ff1014)
- docs: update snippet to not use normalization (9238a3b6917121615a94512ec8a84a3d51cf0107)
- fix: use /embed endpoint (6f75acd873b3c46c9e34ed45b061030b24e4e199)
- fix: remove mention of OpenAI API (df3b2f9a723081dc4606908bed889141f2cb5be3)
README.md CHANGED

@@ -86,6 +86,56 @@ embeddings = model.encode(texts, quantization="binary") # Shape: (5, 2560), quan
</details>

<details>
<summary>Using Text Embeddings Inference (TEI)</summary>

> [!NOTE]
> Text Embeddings Inference v1.9.0 will be released as stable soon; in the meantime,
> feel free to use the `latest` containers, or pin the image via its SHA ``.

> [!IMPORTANT]
> Currently, only int8-quantized embeddings are available via TEI. These embeddings are not normalized, so remember to use cosine similarity (rather than a plain dot product) when comparing them.

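For instance, a minimal sketch (hypothetical vectors, assuming NumPy) of how such unnormalized embeddings would be compared:

```python
import numpy as np

# Hypothetical unnormalized int8-quantized embedding vectors returned by TEI
a = np.array([12.0, -3.0, 27.0, 5.0])
b = np.array([10.0, -1.0, 30.0, 2.0])

# Cosine similarity: divide by the norms, since the vectors are not unit-norm
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine)
```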
- CPU w/ Candle:

```bash
docker run -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id perplexity-ai/pplx-embed-1-4B --auto-truncate
```

- CPU w/ ORT (ONNX Runtime):

```bash
docker run -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id onnx-community/pplx-embed-1-4B --auto-truncate
```

- GPU w/ CUDA:

```bash
docker run --gpus all --shm-size 1g -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:cuda-latest --model-id perplexity-ai/pplx-embed-1-4B --auto-truncate
```

> Alternatively, when running on CUDA you can use an architecture / compute-capability specific
> container instead of `cuda-latest`, which bundles the binaries for Turing, Ampere, and Hopper;
> a dedicated container such as `ampere-latest` will therefore be lighter.

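For example, a sketch of the same GPU launch pinned to the Ampere-specific tag mentioned above (assuming an Ampere-generation GPU):

```bash
docker run --gpus all --shm-size 1g -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:ampere-latest --model-id perplexity-ai/pplx-embed-1-4B --auto-truncate
```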
And then you can send requests to it via cURL to `/embed`:

```bash
curl http://0.0.0.0:8080/embed \
    -H "Content-Type: application/json" \
    -d '{
        "inputs": [
            "Scientists explore the universe driven by curiosity.",
            "Children learn through curious exploration.",
            "Historical discoveries began with curious questions.",
            "Animals use curiosity to adapt and survive.",
            "Philosophy examines the nature of curiosity."
        ],
        "normalize": false
    }'
```
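The same request can also be sent from Python. A minimal sketch, assuming the `requests` library is installed, the server above is listening on port 8080, and `/embed` returns one embedding vector per input as a JSON list of lists:

```python
import numpy as np
import requests

payload = {
    "inputs": [
        "Scientists explore the universe driven by curiosity.",
        "Children learn through curious exploration.",
    ],
    # Keep the unnormalized int8-quantized embeddings, as in the cURL example
    "normalize": False,
}

response = requests.post("http://0.0.0.0:8080/embed", json=payload)
response.raise_for_status()

embeddings = np.array(response.json())  # shape: (num_inputs, embedding_dim)

# Pairwise cosine similarities: normalize the rows first, since the
# embeddings are not unit-norm (see the note above)
normalized = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(normalized @ normalized.T)
```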

</details>

## Technical Details
