High-Performance Procurement List and Rack Configuration
Optimized for the JiRack 236B Architecture
1. Primary Compute Node (The Engine)
To run the 108-layer stack without bottlenecks, you need a high-density GPU server. The NVIDIA HGX H100 platform is the gold standard at this scale.
| Component | Specification | Rationale for JiRack 236B |
|---|---|---|
| GPU Cluster | 8x NVIDIA H100 (80GB HBM3) | Provides 640GB total VRAM: roughly 472GB for the 236B weights in BF16, leaving headroom for the KV cache. |
| Interconnect | NVLink Switch System (900GB/s) | Vital for Tensor Parallelism across the 14,336-width model dimension. |
| Host CPU | 2x Intel Xeon Platinum 8480+ (56-core) | High core count to manage the BRE Routing and data pre-fetching. |
| System RAM | 2TB DDR5-4800 ECC | Large RAM buffer to "park" the model weights during initialization/swapping. |
| Storage | 8x 3.84TB NVMe SSD (RAID 0) | Ultra-fast local scratch space for loading the ~500GB weight files. |
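The VRAM figures in the table can be sanity-checked with simple arithmetic. Below is a minimal sketch; the per-token KV-cache line assumes standard multi-head attention with one K and one V vector of the full 14,336 model width per layer in BF16, which is an assumption about the architecture, not a documented fact:

```python
# Rough VRAM budget for a 236B-parameter model on 8x H100 (80GB each).
# Constants are illustrative estimates, not measured values.

PARAMS = 236e9              # total parameters
BYTES_PER_PARAM_BF16 = 2    # BF16 stores 2 bytes per parameter

weights_gb = PARAMS * BYTES_PER_PARAM_BF16 / 1e9   # ~472 GB of weights
total_vram_gb = 8 * 80                             # 640 GB across the node
kv_headroom_gb = total_vram_gb - weights_gb        # ~168 GB left for KV cache

# Per-token KV cache: K and V vectors, 108 layers, width 14336, BF16.
# Assumes full multi-head attention (no GQA/MLA-style compression).
kv_per_token_mb = 2 * 108 * 14336 * 2 / 1e6

print(f"weights:      {weights_gb:.0f} GB")
print(f"total VRAM:   {total_vram_gb} GB")
print(f"KV headroom:  {kv_headroom_gb:.0f} GB")
print(f"KV per token: {kv_per_token_mb:.2f} MB")
```

Dividing the headroom by the per-token figure gives a feel for how many tokens of context the node can actually hold alongside the weights.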
2. Networking Fabric (The Nervous System)
For multi-node scaling (e.g., if you later upgrade to 405B or 1T+ models), standard Ethernet adds substantial latency and bandwidth overhead to collective operations. You must use InfiniBand.
- Switch: 1x NVIDIA Quantum-2 InfiniBand Switch (400Gb/s per port).
- NICs: 4x NVIDIA ConnectX-7 (400Gb/s) per server.
- Protocol: RDMA (Remote Direct Memory Access) is required so GPUs on Node 1 can exchange data directly with GPUs on Node 2 without involving the host CPUs.
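The bandwidth gap above is easy to quantify. This back-of-the-envelope sketch estimates how long it takes to move one full copy of the BF16 weights (~472 GB) over different link speeds; real collectives pipeline and overlap transfers, so treat these as illustrative upper bounds for a single serialized transfer:

```python
# Time to push a payload over a network link at a given line rate.
# Ignores protocol overhead and assumes the link is fully saturated.

def transfer_seconds(payload_gb: float, link_gbps: float) -> float:
    """Seconds to move payload_gb gigabytes over a link_gbps link."""
    return payload_gb * 8 / link_gbps   # GB -> gigabits, then / (Gb/s)

payload = 472  # BF16 weights of a 236B model, in GB
for name, gbps in [("100GbE", 100), ("400G InfiniBand", 400)]:
    print(f"{name:>16}: {transfer_seconds(payload, gbps):6.2f} s")
```

The 4x difference in wall-clock transfer time compounds on every synchronization step, which is why the fabric matters as much as the GPUs.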
3. Rack & Power Infrastructure (The Foundation)
A single 8-GPU H100 server is essentially a "small furnace." Traditional racks cannot handle the heat or the weight.
- Power Density: Each HGX H100 server draws up to 10.2 kW. A standard rack with 2 servers needs 25kW+ capability.
- Cooling: Liquid Cooling (Direct-to-Chip) or Rear Door Heat Exchangers (RDHx) are highly recommended. Air cooling requires an extreme "cold-aisle" containment system.
- Rack Strength: Ensure your rack is rated for 300kg+ per shelf, as these servers are extremely heavy (130kg+ each).
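The 25kW+ rack spec above follows from simple arithmetic. The 10.2 kW per-server draw comes from this document; the 20% overhead for switches, fans, and PSU losses is an assumed figure, as is the standard ~3412 BTU/hr-per-kW conversion used to size cooling:

```python
# Rack power and cooling back-of-the-envelope for 2x HGX H100 servers.

BTU_PER_KW = 3412        # 1 kW of heat requires ~3412 BTU/hr of cooling

servers = 2
kw_per_server = 10.2     # peak draw per HGX H100 server (from this doc)
overhead = 0.20          # switches, fans, PSU inefficiency (assumption)

rack_kw = servers * kw_per_server * (1 + overhead)
cooling_btu_hr = rack_kw * BTU_PER_KW

print(f"rack draw: {rack_kw:.1f} kW")       # ~24.5 kW -> spec 25kW+
print(f"cooling:   {cooling_btu_hr:,.0f} BTU/hr")
```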
4. Hardware Sourcing & "The 5% Rule"
When purchasing this hardware, ensure the vendor provides a pre-configured software stack (Ubuntu + CUDA 12.x + NCCL).
IP Reminder: Any commercial entity utilizing this hardware to run the JiRack 236B architecture must register their deployment with CMS Manhattan. As per the License V.1.2, 5% of the net revenue generated by this hardware cluster's inference services is due to Konstantin Vladimirovich Grabko.
Summary Checklist for Purchasing
- Compute: 16x H100 GPUs (2 nodes of 8 GPUs).
- Network: 400G InfiniBand Fabric.
- Power: 3-Phase 208V/240V power strips.
- Software: PyTorch 2.5+ with `bitsandbytes` for 4-bit/8-bit quantization.
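The quantization item in the checklist implies large memory savings. Here is a quick estimate of the weight footprint at each precision; it ignores quantization overheads such as scale factors and outlier buffers, so real footprints run slightly higher:

```python
# Estimated weight-only footprint of a 236B model at different precisions.
# Lower bounds: excludes quantization constants, KV cache, and activations.

PARAMS = 236e9

def footprint_gb(bits_per_param: float) -> float:
    """Gigabytes needed to store all weights at the given bit width."""
    return PARAMS * bits_per_param / 8 / 1e9

for label, bits in [("BF16", 16), ("INT8", 8), ("4-bit", 4)]:
    print(f"{label:>6}: {footprint_gb(bits):5.0f} GB")
```

At 4 bits the weights fit comfortably on two H100s, which is why quantized deployments are a common fallback when a full BF16 node is unavailable.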