JiRack_GPT5_236b / HardwareProcurementList.md
kgrabko's picture
Upload HardwareProcurementList.md
fb92043 verified

High-Performance Procurement List and Rack Configuration

Optimized for the JiRack 236B Architecture


1. Primary Compute Node (The Engine)

To run the 108-layer stack without bottlenecks, you require a high-density GPU server. The NVIDIA HGX H100 is the gold standard for this scale.

Component Specification Rationale for JiRack 236B
GPU Cluster 8x NVIDIA H100 (80GB HBM3) Provides 640GB total VRAM, enough for the full 236B model in BF16 + KV Cache.
Interconnect NVLink Switch System (900GB/s) Vital for Tensor Parallelism across the 14,336-width model dimension.
Host CPU 2x Intel Xeon Platinum 8480+ (56-core) High core count to manage the BRE Routing and data pre-fetching.
System RAM 2TB DDR5-4800 ECC Large RAM buffer to "park" the model weights during initialization/swapping.
Storage 8x 3.84TB NVMe SSD (RAID 0) Ultra-fast local scratch space for loading the ~500GB weight files.

2. Networking Fabric (The Nervous System)

For multi-node scaling (e.g., if you upgrade to 405B or 1T+ later), standard Ethernet will cause 50%+ latency overhead. You must use InfiniBand.

  • Switch: 1x NVIDIA Quantum-2 InfiniBand Switch (400Gb/s per port).
  • NICs: 4x NVIDIA ConnectX-7 (400Gb/s) per server.
  • Protocol: RDMA (Remote Direct Memory Access) is required to allow GPUs on Node 1 to talk to Node 2 without touching the CPU.

3. Rack & Power Infrastructure (The Foundation)

A single 8-GPU H100 server is essentially a "small furnace." Traditional racks cannot handle the heat or the weight.

  • Power Density: Each HGX H100 server draws up to 10.2 kW. A standard rack with 2 servers needs 25kW+ capability.
  • Cooling: Liquid Cooling (Direct-to-Chip) or Rear Door Heat Exchangers (RDHx) are highly recommended. Air cooling requires an extreme "cold-aisle" containment system.
  • Rack Strength: Ensure your rack is rated for 300kg+ per shelf, as these servers are extremely heavy (130kg+ each).

4. Hardware Sourcing & "The 5% Rule"

When purchasing this hardware, ensure the vendor provides a pre-configured software stack (Ubuntu + CUDA 12.x + NCCL).

IP Reminder: Any commercial entity utilizing this hardware to run the JiRack 236B architecture must register their deployment with CMS Manhattan. As per the License V.1.2, 5% of the net revenue generated by this hardware cluster's inference services is due to Konstantin Vladimirovich Grabko.


Summary Checklist for Purchasing

  • Compute: 16x H100 GPUs (2 nodes of 8 GPUs).
  • Network: 400G InfiniBand Fabric.
  • Power: 3-Phase 208V/240V power strips.
  • Software: PyTorch 2.5+ with bitsandbytes for 4-bit/8-bit quantization.