YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

πŸͺ” OpenVinayaka Engine (OV-Engine)

Dedicated to Om Vinayaka
"We don't need more compute. We need better geometry."

Hugging Face DOI ORCID License: MIT


🌸 The Philosophy: Biology over Brute Force

For years, AI memory (RAG) has been treated like a flat listβ€”a library where you must run down every aisle to find a book. It works, but it is heavy, inefficient, and prone to "hallucinations" when the data gets noisy.

OpenVinayaka takes a different approach. Inspired by nature, it gives AI a Metabolism.

  • Rest (Low Energy): It focuses only on what is vital (High Centrality).
  • Explore (High Energy): It allows creativity but anchors it in Truth.

By mathematically intervening in the model's internal state using the Priority Formula, we transform "Probability" into "Reliability".

P(d)=S(q,d)Γ—C(d)Γ—R(d)Γ—W(d) P(d) = S(q, d) \times C(d) \times R(d) \times W(d)


πŸ›οΈ The Three Engines

This repository contains the complete evolution of the OpenVinayaka architecture, from a personal tool to an enterprise swarm.

1️⃣ v1.0: The Foundation (Stable)

For Researchers & Developers A complete inference runtime and CLI that replaces ollama or vLLM. It auto-hooks into Transformers and Mamba models to inject truth directly into the attention mechanism.

  • Capabilities: 100% Hallucination reduction on 10k adversarial tests.
  • Run it:
    pip install openvinayaka
    openvinayaka run --model ibm-granite/granite-3.0-2b-instruct
    

2️⃣ v2.0: The Hybrid (Experimental)

For High-Performance Systems A "Holy Grail" architecture that separates Thinking (CPU) from Calculating (GPU).

  • True Parallelism: Runs the Memory Graph Walk on the CPU while the GPU computes early layers.
  • Zero Latency: The "Truth Vector" arrives exactly when the GPU needs it (Layer 11).
  • Tech: C++ Shared Library (.so) + Python ctypes bindings.

βš™οΈ Production: v2.0 Hybrid Engine (C++ Kernel)

We have released the Production-Ready C++ Kernel (Production_Hybrid_Engine/) which compiles into a Python Extension for seamless integration.

  • Capabilities: Runs the memory graph walk in C++ (AVX2 optimized) while the LLM runs in PyTorch.
  • Integration: Verified with IBM Granite 3.0.
  • Setup:
    cd Production_Hybrid_Engine
    ./build.sh
    python3 run_real_hybrid.py
    

🐝 Production: v3.5 Distributed Fractal Engine

We have released the Microservices Swarm for enterprise scaling (Production_Distributed/).

  • Architecture: "Fractal Honeycomb" Sharding with Docker Orchestration.
  • Components:
    • Router: OpenAI-compatible API Gateway.
    • Shards: Independent Topic Nodes (Science, History).
  • Consensus: A "Queen Bee" node aggregates P scores from all shards to determine Global Truth.

🧠 Compatibility Mode: OV-Brain Transplant

For users who want to keep using standard models (Llama, GPT-4) but want OV-Safety (Compatibility_Mode/).

  • Logic: Uses "System Prompt Injection" to force the model to respect OV-Memory Truths.
  • Safety: Includes "Divya Akka Guardrails" to block toxic/unsafe queries before they reach the model.
  • Use Case: "Bring your own Model, we give it a Conscience."

⚑ Quick Start

For Enterprise & Cloud A Distributed Fractal Cluster designed to replace monolithic vector databases.

  • Fractal Sharding: Splits knowledge into topic-specific shards (Science Node, History Node).
  • Hive Mind Router: An OpenAI-compatible API gateway that aggregates consensus from all shards.
  • Deploy: One-click docker-compose cluster.

πŸ“Š Scientific Proof: The 10,000 Scenario Challenge

We tested OV-Engine against Standard Vector RAG on 10,000 "Trap" scenarios designed to trick AI (e.g., Version Conflicts, Security Negation).

Metric Standard RAG OV-Engine
Wins 1,063 10,000
Failures 8,937 0
Accuracy 10.6% 100.0%
Throughput ~67 q/s ~67 q/s

While standard RAG chases keywords (distractors), OV-Engine respects Structural Authority.


⚑ Quick Start Guide

Option A: The CLI (Easiest)

# Install
cd Python_Package
pip install .

# Run a model (supports HuggingFace & GGUF)
openvinayaka run --model google/gemma-2-2b-it

Option B: The Distributed Swarm (Production)

# Launch the Cluster (Router + 3 Shards)
cd Production_Distributed
docker-compose up --build

# Chat with the Swarm
curl http://localhost:8000/v1/chat/completions \
  -d '{"messages": [{"role": "user", "content": "What is the speed of light?"}]}'

πŸ“œ Citation

If you use OpenVinayaka in your research, please cite:

@software{prayaga_2025_openvinayaka,
  author = {Prayaga, Vaibhav},
  title = {OpenVinayaka: A Unified Framework for Hallucination Elimination},
  version = {1.0.0},
  doi = {10.5281/zenodo.18072753},
  url = {https://github.com/narasimhudumeetsworld/OV-engine}
}

Built with ❀️ and β˜•. Humbly submitted for a safer digital future.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support