
Multi-GPU

RTX PRO 4000 Blackwell workstation cards do not have NVLink, so all GPU-to-GPU traffic in multi-GPU inference goes over PCIe.
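One way to confirm the interconnect on a given machine is to look at the matrix printed by `nvidia-smi topo -m`, where NVLink connections appear as `NV` followed by a link count and PCIe paths appear as labels such as `PIX`, `PHB`, `NODE`, or `SYS`. The sketch below is a minimal check under that assumption; the helper names `has_nvlink` and `read_topology` are hypothetical and not part of these configs.

```python
import re
import subprocess


def has_nvlink(topo_text: str) -> bool:
    """Return True if the topology matrix text contains an NVLink entry.

    NVLink cells look like NV1, NV2, NV4, ... The legend line uses the
    literal "NV#", which this pattern does not match because it needs a digit.
    """
    return re.search(r"\bNV\d+\b", topo_text) is not None


def read_topology() -> str:
    """Run `nvidia-smi topo -m` and return its stdout (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    text = read_topology()
    print("NVLink present" if has_nvlink(text) else "PCIe only")
```

On a host with these cards the check would be expected to report PCIe only, in line with the statement above.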

Measurement guidance:

  • Report gpu_count for every result row.
  • Say whether the backend used layer split, tensor parallelism, expert parallelism, pipeline parallelism, or independent model instances.
  • Keep single-GPU and multi-GPU rows separate.
  • Do not infer multi-GPU benefit from single-stream tok/s alone; measure concurrent aggregate throughput when that is the real workload.
  • For tensor-parallel (TP=2) or all-GPU runs, stop other GPU services first, then verify they have restarted once the benchmark window ends.
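
The guidance above can be sketched as a small result record that carries gpu_count and the split strategy alongside the measurement, and derives aggregate throughput from total output tokens over wall-clock time. This is an illustrative sketch only: apart from gpu_count, the `BenchRow` class, its field names, and the mode labels are assumptions, and the numbers in the usage example are made up, not measured.

```python
from dataclasses import dataclass

# Hypothetical labels for the strategies listed in the guidance above.
PARALLELISM_MODES = {
    "none",                   # single-GPU row
    "layer_split",
    "tensor_parallel",
    "expert_parallel",
    "pipeline_parallel",
    "independent_instances",
}


@dataclass(frozen=True)
class BenchRow:
    gpu_count: int        # number of GPUs used by this run
    parallelism: str      # one of PARALLELISM_MODES
    concurrency: int      # number of simultaneous request streams
    output_tokens: int    # tokens generated across all streams
    wall_seconds: float   # wall-clock duration of the benchmark window

    def __post_init__(self) -> None:
        if self.gpu_count < 1 or self.concurrency < 1:
            raise ValueError("gpu_count and concurrency must be >= 1")
        if self.parallelism not in PARALLELISM_MODES:
            raise ValueError(f"unknown parallelism mode: {self.parallelism}")
        if self.wall_seconds <= 0:
            raise ValueError("wall_seconds must be positive")

    @property
    def aggregate_tok_s(self) -> float:
        """Total tokens per second across all concurrent streams."""
        return self.output_tokens / self.wall_seconds

    @property
    def per_stream_tok_s(self) -> float:
        """Average tokens per second seen by a single stream."""
        return self.aggregate_tok_s / self.concurrency


# Made-up example values, for illustration only.
row = BenchRow(gpu_count=2, parallelism="tensor_parallel",
               concurrency=8, output_tokens=9600, wall_seconds=30.0)
print(row.aggregate_tok_s)   # 320.0
print(row.per_stream_tok_s)  # 40.0
```

Keeping both figures on the same row makes it harder to mistake a single-stream rate for the aggregate rate when comparing single-GPU and multi-GPU results.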