Marai AI: The Sovereign Reasoning Orchestrator

Marai is a Reasoning-as-a-Service (RaaS) platform designed to upgrade existing language models into high-precision autonomous agents. It acts as an intelligent orchestration layer that prioritizes mathematical rigor, technical grounding, and verifiable decision-making.

🚀 The "BYOM" Force Multiplier

Marai is built on a Bring Your Own Model (BYOM) architecture. It is model-agnostic by design, acting as a "Reasoning Accelerator" for any paired frontier model (GPT-4o, Claude 3.5, Llama 3.1 405B). By wrapping these models in Marai's sovereign architecture, users achieve higher precision and reliability than raw model inference alone (a sketch of the pattern follows).
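
The snippet below is a minimal sketch of the BYOM pattern, assuming only an OpenAI-compatible upstream endpoint. It is illustrative, not Marai's internal code: the MARAI_UPSTREAM_URL, MARAI_UPSTREAM_KEY, and MARAI_UPSTREAM_MODEL names and the two-pass deliberate-then-answer loop are assumptions about how an orchestration layer can wrap a paired model.

# Illustrative BYOM sketch only; the MARAI_UPSTREAM_* names and the two-pass
# loop below are assumptions, not Marai's published internals.
import os
from openai import OpenAI

# Any OpenAI-compatible frontier model can sit behind the orchestrator.
upstream = OpenAI(
    base_url=os.environ.get("MARAI_UPSTREAM_URL", "https://api.openai.com/v1"),
    api_key=os.environ["MARAI_UPSTREAM_KEY"],
)
MODEL = os.environ.get("MARAI_UPSTREAM_MODEL", "gpt-4o")

def orchestrate(prompt: str) -> str:
    """Deliberate first, then answer using that deliberation as grounding."""
    deliberation = upstream.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": f"Reason step by step, but do not give a final answer yet:\n{prompt}"}],
    ).choices[0].message.content
    final = upstream.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": f"Deliberation:\n{deliberation}\n\nNow give the final answer to:\n{prompt}"}],
    ).choices[0].message.content
    return final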

Performance Baseline (Standard Local Pair)

Benchmark                    | Marai Score | Comparison
Galileo Agent v2 (Banking)   | 80.00% AC   | Beats GPT-4.1 (62%)
MMLU-Pro (Reasoning)         | 60.00%      | Beats Llama-3-70B (52.8%)
GSM8K (Arithmetic)           | 60.00%      | Standardized Math Quality

πŸ›‘οΈ Auditable Reasoning

Unlike "black-box" models, Marai is fully auditable. It enforces a multi-stage deliberative process where every decision is preceded by a transparent reasoning chain.

  • Verifiable Thinking: Every response includes a dedicated deliberation block, allowing users and evaluators to trace the logic used to arrive at a result.
  • Deterministic Guardrails: Marai cross-references model outputs with symbolic math and curated grounding sources to ensure factual and mathematical consistency (a sketch of this kind of check follows this list).
  • Transparency Without Exposure: Marai provides full visibility into what the agent is thinking, without requiring users to expose their proprietary data or models to the core orchestrator.
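
The guardrail bullet above can be illustrated with a minimal sketch, assuming only that an arithmetic or calculus claim can be re-derived symbolically (here with sympy) and flagged when it disagrees with the model's answer. This is an illustration of the idea, not Marai's actual verifier.

# Minimal guardrail sketch (not Marai's actual verifier): re-derive an
# arithmetic/calculus claim symbolically and flag any mismatch.
import sympy as sp

def check_derivative_claim(expr_str: str, point: float, claimed: float) -> bool:
    """Return True if the model's claimed value of f'(point) matches sympy."""
    x = sp.symbols("x")
    f = sp.sympify(expr_str)                     # e.g. "3*x**2 + 2*x - 5"
    expected = sp.diff(f, x).subs(x, point)      # symbolic derivative, evaluated
    return sp.simplify(expected - claimed) == 0

# f(x) = 3x^2 + 2x - 5  =>  f'(x) = 6x + 2  =>  f'(2) = 14
assert check_derivative_claim("3*x**2 + 2*x - 5", 2, 14)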

πŸ—„οΈ Sovereign Shard Storage

Marai is packaged with a high-performance Local Sharding system (256 independent SQLite cells).

  • Privacy First: All long-term memory, technical grounding, and audit logs are stored locally.
  • Vertical Grounding: Users can ingest their own technical databases into these shards, allowing Marai to reason with specialized private data that never touches the cloud.
  • Zero-Latency Orchestration: The sharded architecture ensures $O(1)$ data retrieval, maintaining high performance even as your local knowledge base grows (a routing sketch follows this list).
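
Below is a minimal sketch of 256-way local sharding. The hash routing, file layout (shards/cell_000.db through cell_255.db), and key-value schema are illustrative assumptions, not Marai's actual storage format.

# Illustrative 256-shard routing sketch; file names and schema are assumed,
# not Marai's actual storage layout.
import hashlib
import sqlite3
from pathlib import Path

SHARD_DIR = Path("shards")
SHARD_DIR.mkdir(exist_ok=True)
NUM_SHARDS = 256

def shard_for(key: str) -> sqlite3.Connection:
    """Hash the key to one of 256 SQLite cells; routing cost does not grow with data size."""
    idx = hashlib.sha256(key.encode()).digest()[0] % NUM_SHARDS  # first hash byte -> 0..255
    conn = sqlite3.connect(SHARD_DIR / f"cell_{idx:03d}.db")
    conn.execute("CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)")
    return conn

def put(key: str, value: str) -> None:
    with shard_for(key) as conn:  # 'with' commits the transaction on success
        conn.execute("INSERT OR REPLACE INTO memory VALUES (?, ?)", (key, value))

def get(key: str):
    row = shard_for(key).execute("SELECT value FROM memory WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

put("audit:example", "derivative check passed")
print(get("audit:example"))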

πŸ› οΈ Usage

Live Edge API (Cloudflare Workers AI)

Marai is deployed globally on Cloudflare's edge network. The API is OpenAI-compatible, so you can drop it into any existing pipeline:

🌐 https://marai-inference.berwicksmith.workers.dev

# Point any OpenAI-compatible client at the Marai edge endpoint.
import openai

client = openai.OpenAI(
    base_url="https://marai-inference.berwicksmith.workers.dev/v1",
    api_key="not-needed"  # placeholder; this endpoint does not require a key
)

# Example prompt: f(x) = 3x² + 2x - 5, so f'(x) = 6x + 2 and f'(2) = 14.
response = client.chat.completions.create(
    model="marai-v1",
    messages=[{"role": "user", "content": "Solve: If f(x) = 3x² + 2x - 5, find f'(2)"}]
)
print(response.choices[0].message.content)

Self-Hosted (Docker)

docker run -p 8000:8000 marai-core:latest
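
Assuming the self-hosted container exposes the same OpenAI-compatible API on the published port 8000 (an assumption based on the port mapping above, not documented behavior), the client from the edge example only needs a different base URL:

# Assumes the self-hosted container serves the same OpenAI-compatible API on port 8000.
import openai

client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="marai-v1",
    messages=[{"role": "user", "content": "Solve: If f(x) = 3x² + 2x - 5, find f'(2)"}]
)
print(response.choices[0].message.content)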

👀 About the Creator

Marai was built by Berwick "BJ" Smith Jr., a former railway engineer (CN Railway), Per Scholas graduate, and full-stack Java developer from Houma, Louisiana.

BJ is not a traditional AI researcher, scientist, or PhD holder. Marai started as a hobby project, born from a simple curiosity about how AI systems think and a conviction that you don't need a lab to build something powerful. What began as tinkering with neural patterns evolved into a full sovereign reasoning architecture with 256-shard databases, symbolic math engines, and enterprise-grade agentic pipelines.

"I'm not from Silicon Valley. I'm from the bayou. I just like building cool things."

📜 Citation

If you use Marai in your research, please cite:

@misc{smith2026marai,
  title={Marai: A Bring-Your-Own-Model Orchestration Layer for High-Fidelity Reasoning},
  author={Smith, Jr., Berwick},
  year={2026},
  note={Independent Research, Marai USA}
}