# JiRack Dense: Ultra-Scale Transformer Architecture (140B - 405B+)

- Author: Konstantin Vladimirovich Grabko
- Organization: CMS Manhattan
- Status: Patent Pending / Proprietary Technology
- Version: 1.2 (Dense High-Precision Edition)
- Model class: JiRack GPT 5 class
## Overview
JiRack Dense is a high-performance transformer architecture designed to bridge the gap between 100B and 500B+ parameter models. Unlike traditional architectures that suffer from memory bottlenecks, JiRack utilizes proprietary SWA Fusion and BRE Routing to maximize throughput on AMD ROCm (Instinct MI300/400) and NVIDIA Hopper/Blackwell hardware.
This repository contains the core logic for the Dense (Non-Ternary) versions of JiRack, optimized for high-fidelity reasoning and stable training on massive datasets like The Pile.
## Key Innovations
### 1. SwiGLU-Attention (SWA) Fusion
A unified compute kernel that merges the Feed-Forward Network (FFN) and Multi-Head Attention (MHA) passes.
- Impact: 30% reduction in VRAM I/O and faster training steps.
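The proprietary SWA kernels are not included here, but the idea of merging the attention and FFN passes can be illustrated at the module level. The sketch below is an illustrative PyTorch approximation, not the JiRack implementation: the class name, layer sizes, and the single-residual design are assumptions. Both sub-blocks read one shared normalized activation and write one shared residual, which is the kind of structure that reduces redundant VRAM reads/writes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedSwiGLUAttention(nn.Module):
    """Illustrative sketch (not the proprietary SWA kernel): attention and a
    SwiGLU FFN consume the same normalized input and share one residual add,
    so the layer performs a single read and a single write of the residual
    stream instead of two of each."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # SwiGLU projections: gate, up, down
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_up = nn.Linear(d_model, d_ff, bias=False)
        self.w_down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)  # one normalization feeds both sub-blocks
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        # SwiGLU: silu(gate(h)) * up(h), projected back down
        ffn_out = self.w_down(F.silu(self.w_gate(h)) * self.w_up(h))
        # single fused residual update for both sub-blocks
        return x + attn_out + ffn_out
```

A real fused kernel would additionally merge these operations below the PyTorch level (e.g., a single Triton/HIP kernel), which is where the claimed VRAM I/O savings come from.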
### 2. Buffered Routing Embedding (BRE)
A predictive HBM management system that pre-fetches embedding weights into high-speed buffers.
- Impact: Eliminates GPU "starvation" during inference and allows for context windows up to 128k.
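The BRE algorithms themselves live in `/docs`, but the prefetch-into-a-fast-buffer pattern can be sketched generically. The class below is a hypothetical illustration (names and API are assumptions, not the BRE spec): rows needed by the *next* batch are copied into a small buffer ahead of time, so the lookup path can serve a buffer hit instead of stalling on the full table.

```python
import torch

class BufferedEmbedding(torch.nn.Module):
    """Hypothetical sketch of buffered embedding lookup (not the BRE
    implementation). The full table lives in slow memory; `prefetch` copies
    only the rows the next batch will need into a small fast buffer, and
    `forward` serves from that buffer when the ids match."""

    def __init__(self, vocab_size: int, d_model: int, device: str = "cpu"):
        super().__init__()
        self.table = torch.nn.Parameter(
            torch.randn(vocab_size, d_model), requires_grad=False
        )
        self.device = device
        self._buffer = None      # prefetched rows
        self._buffer_ids = None  # token ids the buffer corresponds to

    def prefetch(self, next_ids: torch.Tensor) -> None:
        # Copy only the needed rows; non_blocking overlaps the copy with
        # compute when source and destination allow it.
        self._buffer_ids = next_ids
        self._buffer = self.table[next_ids].to(self.device, non_blocking=True)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        if self._buffer is not None and torch.equal(ids, self._buffer_ids):
            return self._buffer  # hit: no table access on the hot path
        # miss: fall back to a direct (slow) lookup
        return self.table[ids].to(self.device)
```

In a production system the prediction of *which* rows to prefetch is the hard part; this sketch assumes the caller already knows the next batch's ids.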
### 3. Frontier Scaling
Optimized configurations for extreme scales:
- 140B: The efficiency leader for enterprise clusters.
- 236B: Balanced frontier performance.
- 405B+: SOTA-level reasoning capabilities.
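The three scale points can be expressed as configuration presets. The layer counts and hidden sizes below are illustrative assumptions chosen so that the standard dense-transformer estimate of roughly `12 * n_layers * d_model**2` parameters lands near each advertised scale; they are not the proprietary JiRack configurations.

```python
# Illustrative presets only -- depths and widths are assumptions, chosen so
# the rough dense-transformer parameter estimate matches each scale point.
JIRACK_PRESETS = {
    "140B":  dict(n_layers=80,  d_model=12288, n_heads=96,  context=131072),
    "236B":  dict(n_layers=96,  d_model=14336, n_heads=112, context=131072),
    "405B+": dict(n_layers=126, d_model=16384, n_heads=128, context=131072),
}

def approx_params(cfg: dict) -> float:
    """Standard rough estimate for a dense transformer: attention plus FFN
    weights scale as ~12 * n_layers * d_model^2 (embeddings excluded)."""
    return 12 * cfg["n_layers"] * cfg["d_model"] ** 2
```

This estimate ignores embedding tables and biases, so treat it as an order-of-magnitude check, not an exact count.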
## Repository Structure
- `/models` - Architecture definitions (`JiRackPyTorch_GPT5_class_Xb.py`).
- `/docs` - Patent claims, technical specifications, and BRE algorithms.
- `load_small_pile_GPT5_1b.py` - Standard training script with DeepSpeed support.
- `LICENSE` - Commercial proprietary license terms.
## Licensing & Legal
PROPRIETARY TECHNOLOGY - PATENT PENDING
This software is licensed under the CMS Manhattan JiRack V.1.2 License.
- Commercial Use: Requires a royalty-bearing agreement (5% Net Revenue).
- Restrictions: No reverse engineering of SWA kernels; no Knowledge Distillation for non-JiRack models.
- Attribution: Commercial products must state: "Powered by CMS Manhattan JiRack Technology by Konstantin Vladimirovich Grabko."
Refer to license_dense.md for the full legal text.
## Performance Targets
| Model Scale | Target Hardware | Precision | Optimized Interconnect |
|---|---|---|---|
| 140B | 8x A100/H100 | BF16 | NVLink / Infinity Fabric |
| 236B | 16x H100 | BF16 | 800Gbps InfiniBand |
| 405B+ | 32x H100/H200 | BF16 | Ultra-Ethernet / RoCE v2 |
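A quick sanity check on the table's hardware sizing: at BF16, weights alone cost 2 bytes per parameter, so the per-GPU weight footprint for each configuration can be estimated directly. The helper below is a back-of-the-envelope calculation (it deliberately excludes activations, optimizer state, and KV cache, which dominate during training).

```python
def bf16_weight_gb_per_gpu(params_billion: float, n_gpus: int) -> float:
    """Weight-only memory per GPU at BF16 (2 bytes/param), assuming weights
    are sharded evenly across GPUs. Excludes activations, optimizer state,
    and KV cache."""
    return params_billion * 2 / n_gpus  # 1e9 params * 2 bytes / 1e9 -> GB

for params_billion, n_gpus in [(140, 8), (236, 16), (405, 32)]:
    gb = bf16_weight_gb_per_gpu(params_billion, n_gpus)
    print(f"{params_billion}B on {n_gpus} GPUs: {gb:.1f} GB/GPU for weights")
```

These weight-only figures fit comfortably in 80 GB-class accelerators (A100/H100/MI300), which is consistent with the GPU counts in the table; the remaining headroom is what activations and KV cache consume at long context.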
## Contact
For licensing inquiries, enterprise deployment, or technical partnership:
Konstantin Vladimirovich Grabko
Email: grabko@cmsmanhattan.com
Phone: +1 (516) 777-0945