Proprietary Invention Package – Ternary-Quantized Transformer Optimization
Status: DEV (still in training)
Inventor: Konstantin Vladimirovich Grabko
Email: grabko@cmsmanhattan.com
Date: December 21, 2025
Needed: A sponsor for distilling Llama 405B to quality comparable with the original Llama 405B, or a data center willing to cooperate
Overview: This package contains documentation for a novel, proprietary method enabling efficient LLM inference on AMD ROCm hardware using ternary quantization, BRE, and SWA fusion.
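For orientation, a minimal Python sketch of the ternary step is shown below, assuming the published BitNet b1.58 "absmean" quantizer (the package's BitNet layers are referenced later in this card). The proprietary BRE and SWA-fusion stages are not reproduced here, and the function names are illustrative.

```python
import torch

def ternary_quantize(w: torch.Tensor, eps: float = 1e-8):
    """Quantize a weight tensor to {-1, 0, +1} plus one FP scale.

    This is the published BitNet b1.58 'absmean' recipe, shown for
    orientation only; it is not the proprietary BRE/SWA-fusion path.
    """
    scale = w.abs().mean().clamp(min=eps)   # per-tensor absmean scale
    q = (w / scale).round().clamp_(-1, 1)   # ternary codes in {-1, 0, +1}
    return q.to(torch.int8), scale          # codes + FP scale for dequant

def ternary_dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an FP approximation of the original weights."""
    return q.float() * scale
```

Packing the three codes into 2 bits each (plus the scales) is what yields the weight footprint discussed under VRAM Efficiency below.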
Contents:
- license.md
- NDA.md
- invention_description.md
- claims.md
- performance_data.md
- [Diagrams and attachments]
Confidential: All materials are proprietary. Contact inventor for licensing discussions.
⚠️ IMPORTANT NOTICE — PROPRIETARY TECHNOLOGY
This model and all accompanying code, algorithms, and documentation are proprietary technology owned by Konstantin Vladimirovich Grabko.
© 2025 Konstantin Vladimirovich Grabko. All Rights Reserved. Patent Pending.
Allowed:
- Personal and non-commercial research use only
Strictly Prohibited without a written commercial license:
- Any commercial use (SaaS, mobile apps, edge devices, paid services, etc.)
- Creating and distributing derivative models for profit
- Removing or modifying any copyright or legal notices
- Patenting any part of this technology
Commercial users must obtain a signed license and pay a 5% royalty on net revenue.
Any unauthorized commercial use will be pursued legally under New York law.
Contact for commercial license: grabko@cmsmanhattan.com
Benefits for the JiRack 405B Project
VRAM Efficiency
✅ Easy fine-tuning: fine-tuning a 405B model normally requires massive resources, but with LoRA and the method's ~70% VRAM reduction, users can fine-tune on consumer-grade multi-GPU setups.
Trainable parameters (a back-of-envelope check follows this list):
- Base model (frozen): 405B parameters @ 2-bit = ~101 GB
- LoRA adapters (r=16): ~50M parameters @ FP32 = ~200 MB
- Total VRAM for weights: ~101 GB (fits on 4x RTX 4090 with offloading)
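The arithmetic behind these numbers can be checked with a small helper; it counts weight storage only and ignores activations, the KV cache, and quantization scales. The function name and defaults are illustrative.

```python
def vram_estimate_gb(n_base: float = 405e9, bits_base: int = 2,
                     n_lora: float = 50e6, bytes_lora: int = 4) -> float:
    """Weight-only VRAM estimate: frozen quantized base + FP32 LoRA adapters."""
    base_gb = n_base * bits_base / 8 / 1e9   # 405B @ 2-bit -> ~101.3 GB
    lora_gb = n_lora * bytes_lora / 1e9      # 50M @ FP32   -> ~0.2 GB
    return base_gb + lora_gb

print(f"~{vram_estimate_gb():.1f} GB")  # ~101.5 GB, hence offloading on 4x 24 GB cards
```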
Thermal Stability
✅ Since only a small fraction of parameters is updated, the thermal footprint remains consistent with the SWA Fusion goal of staying below 80 °C.
JiRack Ternary 405B is built on BitNet layers with a meta-llama/Llama-3.1-405B-compatible tokenizer. Weights are distributed in the safetensors format (a loading sketch follows).
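A loading sketch under the stated compatibility claims might look like the following; the shard filename is hypothetical, and access to the gated meta-llama repository is assumed.

```python
from transformers import AutoTokenizer
from safetensors.torch import load_file

# Llama-3.1-405B-compatible tokenizer (gated repo; requires HF access approval).
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-405B")

# Ternary weights ship as safetensors; load one shard to CPU for inspection.
# NOTE: the shard filename below is hypothetical.
state = load_file("model-00001-of-000xx.safetensors", device="cpu")

print(tok("JiRack ternary smoke test").input_ids[:8])
print(next(iter(state)))  # first tensor name in the shard
```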
Model tree for kgrabko/JiRackTernary_405b
- Base model: meta-llama/Llama-3.1-405B