---
license: other
license_name: cms-manhattan-jirack-v1.2
license_link: LICENSE
---
# JiRack Dense: Scalable Transformer Series (7B, 13B, 70B)
**Author:** Konstantin Vladimirovich Grabko
**Organization:** CMS Manhattan
**Status:** PATENT PENDING / PROPRIETARY TECHNOLOGY
**Invention Class:** High-Resolution Dense Architecture (V.1.2)
---
# JiRack GPT-5 Class
## 🚀 The Scalable Frontier
The JiRack Dense series provides a unified architectural framework for state-of-the-art language modeling. By utilizing **SWA Fusion** and **Buffered Routing**, these models achieve significantly higher throughput than standard Llama-based architectures.
### Available Configurations
| Model | Parameters | Target Hardware | Optimization |
| :--- | :--- | :--- | :--- |
| **JiRack 7B** | 7.2 Billion | 1x RTX 3090/4090 | High-speed Edge Reasoning |
| **JiRack 13B** | 13.5 Billion | 1x A100 (40GB) | Advanced Logical Synthesis |
| **JiRack 70B** | 70.8 Billion | 4x - 8x H100 | Enterprise Flagship Performance |
---
## 🛠 Proprietary Core Technologies
### 1. SWA Fusion (SwiGLU-Attention)
The core of JiRack's speed. By fusing the Attention and SwiGLU FFN layers into a single computational kernel, we eliminate redundant memory R/W cycles.
* **Benefit:** 30% reduction in VRAM latency.
* **Architecture:** Integrated compute graph for AMD ROCm and NVIDIA CUDA.
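For reference, the SwiGLU feed-forward network that SWA Fusion combines with attention can be sketched in a few lines. This is the standard SwiGLU formulation, not the proprietary fused kernel; the function names and illustrative dimensions below are assumptions for the sketch only.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU FFN: down_proj( silu(x @ w_gate) * (x @ w_up) )."""
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Illustrative shapes (not JiRack's actual dimensions).
rng = np.random.default_rng(0)
d_model, d_ff = 8, 16
x = rng.standard_normal((2, d_model))
w_gate = rng.standard_normal((d_model, d_ff))
w_up = rng.standard_normal((d_model, d_ff))
w_down = rng.standard_normal((d_ff, d_model))
y = swiglu_ffn(x, w_gate, w_up, w_down)   # shape (2, d_model)
```

In the fused kernel, this FFN and the preceding attention block would share one compute graph so intermediate activations never round-trip through VRAM.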
### 2. BRE (Buffered Routing Embedding)
A hardware-aware embedding system designed for HBM3/4. BRE pre-fetches token weights into a local ring buffer based on predictive token sequencing.
* **Benefit:** Zero-latency embedding lookups even at maximum sequence lengths.
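The idea behind BRE can be illustrated with a small ring-buffer cache that pre-loads embeddings for tokens predicted to appear next, so lookups hit local memory. The class name, FIFO eviction policy, and capacity below are assumptions for illustration; the actual routing logic is proprietary.

```python
class RingBufferEmbeddingCache:
    """Illustrative sketch of a prefetching ring buffer for embedding rows."""

    def __init__(self, embedding_table, capacity=4):
        self.table = embedding_table   # token_id -> embedding vector
        self.capacity = capacity
        self.buffer = {}               # token_id -> cached vector
        self.order = []                # insertion order for ring eviction
        self.hits = 0
        self.misses = 0

    def prefetch(self, predicted_ids):
        """Pre-load embeddings for tokens predicted to occur soon."""
        for tid in predicted_ids:
            self._insert(tid)

    def lookup(self, token_id):
        if token_id in self.buffer:
            self.hits += 1
        else:
            self.misses += 1
            self._insert(token_id)
        return self.buffer[token_id]

    def _insert(self, token_id):
        if token_id in self.buffer:
            return
        if len(self.order) >= self.capacity:
            oldest = self.order.pop(0)  # evict oldest slot, ring-style
            del self.buffer[oldest]
        self.buffer[token_id] = self.table[token_id]
        self.order.append(token_id)

table = {i: [float(i)] * 4 for i in range(10)}
cache = RingBufferEmbeddingCache(table, capacity=4)
cache.prefetch([1, 2, 3])
cache.lookup(2)   # hit: prefetched above
cache.lookup(7)   # miss: fetched into the buffer on demand
```

When the predictor is accurate, lookups resolve from the buffer rather than the full table, which is the source of the claimed latency benefit.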
### 3. GQA Scaling
Optimized Grouped-Query Attention ratios (4:1 for 7B, 8:1 for 70B) to ensure the KV-cache remains manageable during long-context operations without degrading reasoning quality.
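The KV-cache savings from these GQA ratios are easy to quantify. The sketch below uses illustrative layer counts and head dimensions (not published JiRack specs); only the 4:1 and 8:1 ratios come from the description above.

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, gqa_ratio, seq_len,
                   dtype_bytes=2):
    """Bytes needed for the K and V caches of one sequence."""
    kv_heads = n_heads // gqa_ratio                # grouped KV heads
    per_layer = 2 * kv_heads * head_dim * seq_len * dtype_bytes  # K + V
    return n_layers * per_layer

# Illustrative 7B-class shape at 4:1 GQA, 8k context, fp16:
size_gqa = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                          gqa_ratio=4, seq_len=8192)
size_mha = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                          gqa_ratio=1, seq_len=8192)
print(f"GQA 4:1 -> {size_gqa / 2**30:.2f} GiB vs MHA {size_mha / 2**30:.2f} GiB")
```

For these assumed dimensions, 4:1 grouping shrinks the cache by exactly the grouping factor, which is what keeps long-context KV memory manageable.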
---
## โš–๏ธ Legal & Licensing Notice
**NOTICE: PATENT PENDING**
This repository contains proprietary technology owned by **Konstantin Vladimirovich Grabko**. Access is granted under the following conditions:
1. **Commercial Use:** Requires a 5% Net Revenue royalty agreement.
2. **IP Protection:** No reverse engineering of SWA kernels or BRE routing logic is permitted.
3. **No "Patent-Around":** Licensees agree not to file IP claims based on the methods described herein.
4. **Attribution:** Any derivative work must cite:
*"Powered by CMS Manhattan JiRack Technology. Inventor: Konstantin Vladimirovich Grabko."*
Refer to `license_dense.md` for full legal documentation.
---
## 📦 Setup & Training
### Environment
* **Framework:** PyTorch 2.3+
* **Accelerator:** AMD ROCm 6.0+ or NVIDIA CUDA 12.1+
* **Distributed:** Integrated support for DeepSpeed ZeRO-2/3.
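A minimal DeepSpeed configuration for the ZeRO-3 path might look like the following. The values are illustrative placeholders, not tuned JiRack settings.

```python
# Illustrative DeepSpeed ZeRO-3 config; batch sizes are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,          # use 2 for ZeRO-2
        "overlap_comm": True,
    },
}

# Typical use (requires the deepspeed package, a model, and parameters):
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```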
### Quick Start
To initialize a model from the factory:
```python
from JiRack_Dense_Factory import get_jirack_config, JiRackPyTorch

# Initialize the 13B configuration
config = get_jirack_config("13b")
model = JiRackPyTorch(config)
```