---
license: other
license_name: cms-manhattan-jirack-v1.2
license_link: LICENSE
---
# JiRack Dense: Scalable Transformer Series (7B, 13B, 70B)
**Author:** Konstantin Vladimirovich Grabko
**Organization:** CMS Manhattan
**Status:** PATENT PENDING / PROPRIETARY TECHNOLOGY
**Invention Class:** High-Resolution Dense Architecture (V.1.2)
---
# JiRack GPT-5 Class
## 🚀 The Scalable Frontier
The JiRack Dense series provides a unified architectural framework for state-of-the-art language modeling. By utilizing **SWA Fusion** and **Buffered Routing**, these models achieve significantly higher throughput than standard Llama-based architectures.
### Available Configurations
| Model | Parameters | Target Hardware | Optimization |
| :--- | :--- | :--- | :--- |
| **JiRack 7B** | 7.2 Billion | 1x RTX 3090/4090 | High-speed Edge Reasoning |
| **JiRack 13B** | 13.5 Billion | 1x A100 (40GB) | Advanced Logical Synthesis |
| **JiRack 70B** | 70.8 Billion | 4x - 8x H100 | Enterprise Flagship Performance |
---
## 🛠 Proprietary Core Technologies
### 1. SWA Fusion (SwiGLU-Attention)
The core of JiRack's speed. By fusing the Attention and SwiGLU FFN layers into a single computational kernel, we eliminate redundant memory R/W cycles.
* **Benefit:** 30% reduction in VRAM latency.
* **Architecture:** Integrated compute graph for AMD ROCm and NVIDIA CUDA.
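For reference, the SwiGLU feed-forward network that SWA Fusion combines with attention can be sketched in a few lines. This is the standard SwiGLU formulation, not the proprietary fused kernel; the function names and illustrative dimensions below are assumptions for the sketch only.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU FFN: down_proj( silu(x @ w_gate) * (x @ w_up) )."""
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Illustrative shapes (not JiRack's actual dimensions).
rng = np.random.default_rng(0)
d_model, d_ff = 8, 16
x = rng.standard_normal((2, d_model))
w_gate = rng.standard_normal((d_model, d_ff))
w_up = rng.standard_normal((d_model, d_ff))
w_down = rng.standard_normal((d_ff, d_model))
y = swiglu_ffn(x, w_gate, w_up, w_down)   # shape (2, d_model)
```

In the fused kernel, this FFN and the preceding attention block would share one compute graph so intermediate activations never round-trip through VRAM.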
### 2. BRE (Buffered Routing Embedding)
A hardware-aware embedding system designed for HBM3/4. BRE pre-fetches token weights into a local ring buffer based on predictive token sequencing.
* **Benefit:** Zero-latency embedding lookups even at maximum sequence lengths.
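The idea behind BRE can be illustrated with a small ring-buffer cache that pre-loads embeddings for tokens predicted to appear next, so lookups hit local memory. The class name, FIFO eviction policy, and capacity below are assumptions for illustration; the actual routing logic is proprietary.

```python
class RingBufferEmbeddingCache:
    """Illustrative sketch of a prefetching ring buffer for embedding rows."""

    def __init__(self, embedding_table, capacity=4):
        self.table = embedding_table   # token_id -> embedding vector
        self.capacity = capacity
        self.buffer = {}               # token_id -> cached vector
        self.order = []                # insertion order for ring eviction
        self.hits = 0
        self.misses = 0

    def prefetch(self, predicted_ids):
        """Pre-load embeddings for tokens predicted to occur soon."""
        for tid in predicted_ids:
            self._insert(tid)

    def lookup(self, token_id):
        if token_id in self.buffer:
            self.hits += 1
        else:
            self.misses += 1
            self._insert(token_id)
        return self.buffer[token_id]

    def _insert(self, token_id):
        if token_id in self.buffer:
            return
        if len(self.order) >= self.capacity:
            oldest = self.order.pop(0)  # evict oldest slot, ring-style
            del self.buffer[oldest]
        self.buffer[token_id] = self.table[token_id]
        self.order.append(token_id)

table = {i: [float(i)] * 4 for i in range(10)}
cache = RingBufferEmbeddingCache(table, capacity=4)
cache.prefetch([1, 2, 3])
cache.lookup(2)   # hit: prefetched above
cache.lookup(7)   # miss: fetched into the buffer on demand
```

When the predictor is accurate, lookups resolve from the buffer rather than the full table, which is the source of the claimed latency benefit.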
### 3. GQA Scaling
Optimized Grouped-Query Attention ratios (4:1 for 7B, 8:1 for 70B) to ensure the KV-cache remains manageable during long-context operations without degrading reasoning quality.
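The KV-cache savings from these GQA ratios are easy to quantify. The sketch below uses illustrative layer counts and head dimensions (not published JiRack specs); only the 4:1 and 8:1 ratios come from the description above.

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, gqa_ratio, seq_len,
                   dtype_bytes=2):
    """Bytes needed for the K and V caches of one sequence."""
    kv_heads = n_heads // gqa_ratio                # grouped KV heads
    per_layer = 2 * kv_heads * head_dim * seq_len * dtype_bytes  # K + V
    return n_layers * per_layer

# Illustrative 7B-class shape at 4:1 GQA, 8k context, fp16:
size_gqa = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                          gqa_ratio=4, seq_len=8192)
size_mha = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                          gqa_ratio=1, seq_len=8192)
print(f"GQA 4:1 -> {size_gqa / 2**30:.2f} GiB vs MHA {size_mha / 2**30:.2f} GiB")
```

For these assumed dimensions, 4:1 grouping shrinks the cache by exactly the grouping factor, which is what keeps long-context KV memory manageable.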
---
## โš–๏ธ Legal & Licensing Notice
**NOTICE: PATENT PENDING**
This repository contains proprietary technology owned by **Konstantin Vladimirovich Grabko**. Access is granted under the following conditions:
1. **Commercial Use:** Requires a 5% Net Revenue royalty agreement.
2. **IP Protection:** No reverse engineering of SWA kernels or BRE routing logic is permitted.
3. **No "Patent-Around":** Licensees agree not to file IP claims based on the methods described herein.
4. **Attribution:** Any derivative work must cite:
*"Powered by CMS Manhattan JiRack Technology. Inventor: Konstantin Vladimirovich Grabko."*
Refer to `license_dense.md` for full legal documentation.
---
## 📦 Setup & Training
### Environment
* **Framework:** PyTorch 2.3+
* **Accelerator:** AMD ROCm 6.0+ or NVIDIA CUDA 12.1+
* **Distributed:** Integrated support for DeepSpeed ZeRO-2/3.
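A minimal DeepSpeed configuration for the ZeRO-3 path might look like the following. The values are illustrative placeholders, not tuned JiRack settings.

```python
# Illustrative DeepSpeed ZeRO-3 config; batch sizes are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,          # use 2 for ZeRO-2
        "overlap_comm": True,
    },
}

# Typical use (requires the deepspeed package, a model, and parameters):
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```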
### Quick Start
To initialize a model from the factory:
```python
from JiRack_Dense_Factory import get_jirack_config, JiRackPyTorch

# Initialize the 13B configuration
config = get_jirack_config("13b")
model = JiRackPyTorch(config)
```