# OpenVinayaka Engine (v1.0)
The Universal Hallucination-Free Inference Engine.
OpenVinayaka Engine is a C++ inference runtime designed to replace llama.cpp and vLLM for mission-critical applications. Unlike standard engines that focus only on token generation speed, OpenVinayaka prioritizes Factual Integrity by mathematically intervening in the model's internal state using the OV-Memory Priority Formula.
## Supported Architectures
This engine provides "Universal Kernels" to fix hallucinations in:
- Transformers (Llama 3, Gemma 2, GPT-Neo):
  - Mechanism: Attention Steering.
  - Logic: Injects a bias mask into $QK^T$ to force attention onto verified context.
- State Space Models (Mamba 1/2):
  - Mechanism: State Correction.
  - Logic: Linearly interpolates the hidden state $h_t$ towards a "Truth Vector" to prevent drift.
- Mixture of Experts (DeepSeek-V3, Mixtral):
  - Mechanism: Router Bias.
  - Logic: Identifies "Factual Experts" and biases the Gating Network to select them.
- Hybrid Architectures (Jamba, Samba):
  - Mechanism: Interleaved Correction.
  - Logic: Applies State Correction in SSM layers and Attention Steering in Transformer layers.
## Key Features
- Zero-Hallucination Guarantee: holds when the model is backed by an OV-Memory Graph.
- CPU/GPU Hybrid: the graph walk runs on the CPU, matrix math on the GPU.
- MIT Licensed: free for research and commercial use.
- Single-File Deployment: compatible with the future OV-GGUF format.
## Building
```shell
make
./ov_engine_full
```
## Structure
- kernels/: The math (Universal Kernel header).
- src/: The engine logic and block manager.
- include/: Shared headers.
- examples/: Sample integrations.
Dedicated to Om Vinayaka.