# Performance Guide

- Embeddings: 512-dimensional vectors, normalized for cosine similarity.
- Run inference with ONNX Runtime using the CPU or GPU execution providers; this works in an air-gapped environment.
- For GPU deployments, use a TensorRT plan built locally with `trtexec`.
- Keep the vector index snapshot on SSD-backed storage.
- Target p95 search latency below 150 ms, measured with a warmed index over a local network.
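
The embedding and latency targets above can be checked with a small script. The following is a minimal sketch, not the project's own tooling: it assumes "cosine-normalized" means scaling each vector to unit L2 length, and the helper names `l2_normalize` and `p95_latency_ms`, the `search_fn` callable, and the warm-up count are hypothetical. It uses only the Python standard library.

```python
import math
import statistics
import time

EMBEDDING_DIM = 512     # dimension stated in the guide
P95_TARGET_MS = 150.0   # latency target stated in the guide


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity.

    Assumption: this is what the guide means by "cosine-normalized".
    """
    if len(vec) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dims, got {len(vec)}")
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def p95_latency_ms(search_fn, queries, warmup=10):
    """Return the p95 latency in milliseconds of search_fn over queries.

    search_fn is a hypothetical callable that runs one search. The first
    `warmup` queries are run untimed so the index is warm before measuring.
    Needs at least two queries.
    """
    for q in queries[:warmup]:
        search_fn(q)
    samples = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples, n=100)[94]
```

To use it, pass a function that issues one query against the deployed index and compare the result with `P95_TARGET_MS`. Run the measurement from a host on the same local network as the index, since the target assumes local network conditions.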