| 2026-04-09 04:54:01,412 [INFO] Loading faiss with AVX512 support. | |
| 2026-04-09 04:54:01,533 [INFO] Successfully loaded faiss with AVX512 support. | |
| 2026-04-09 04:54:03,332 [INFO] Benchmarking full recompute (50 trials)... | |
| 2026-04-09 04:54:10,527 [INFO] Benchmarking streaming inference (50 trials)... | |
| ============================================================ | |
| STREAMING INFERENCE BENCHMARK | |
| ============================================================ | |
| Context: 256 units, 10 features | |
| Device: cuda:0 | |
| ------------------------------------------------------------ | |
| Full recompute: 128.72 ± 65.50 ms | |
| KV-cached: 133.05 ± 78.01 ms | |
| Speedup: 1.0× | |
| ============================================================ | |
| 2026-04-09 04:54:18,907 [INFO] Results saved to outputs/benchmarks/streaming_results.json | |
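
The summary above can be sanity-checked by hand. The following is a minimal sketch, assuming the reported speedup is the ratio of mean latencies (full recompute divided by KV-cached); the log does not state how the figure is derived. The variable names are illustrative and are not taken from the benchmark script.

```python
# Values copied from the benchmark summary above (milliseconds).
full_mean, full_std = 128.72, 65.50
cached_mean, cached_std = 133.05, 78.01

# Assumption: speedup is the ratio of mean latencies (full / cached).
speedup = full_mean / cached_mean

# Compare the gap between the means with the smaller of the two
# reported standard deviations as a rough noise check.
gap = abs(full_mean - cached_mean)
within_noise = gap < min(full_std, cached_std)

print(f"Speedup: {speedup:.2f}x")
print(f"Gap between means: {gap:.2f} ms")
print(f"Gap smaller than one std: {within_noise}")
```

Under that assumption the ratio is slightly below 1 and rounds to the reported 1.0×, and the gap between the two means is far smaller than either standard deviation, so this run does not separate the two paths.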