# Multi-GPU RTX PRO 4000 Blackwell workstation cards do not have NVLink. Multi-GPU inference uses PCIe. Measurement guidance: - Report `gpu_count`. - Say whether the backend used layer split, tensor parallelism, expert parallelism, pipeline parallelism, or independent model instances. - Keep single-GPU and multi-GPU rows separate. - Do not infer multi-GPU benefit from single-stream tok/s alone; measure concurrent aggregate throughput when that is the real workload. - For TP=2 or all-GPU runs, stop other GPU services first and verify they restart after the benchmark window.