# Nexus-Core inference template.
#
# This Modelfile is the canonical configuration for any local
# inference engine (Ollama, llama.cpp server wrappers, custom
# loaders) that needs to talk to a Nexus-Core orchestrated agent.
# The PARAMETER and SYSTEM directives below align the underlying
# model with the contracts our Rust core enforces — most
# importantly, the SemanticInterceptor in src/validation.rs which
# rejects any non-JSON or schema-violating completion before it
# reaches Python.
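#
# Usage sketch, assuming Ollama is the engine. The model name
# `nexus-core` below is a hypothetical label chosen for illustration;
# this repository does not define it.
#   ollama create nexus-core -f Modelfile.nexus
#   ollama run nexus-core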
FROM gemma-4-8b-it-Q4_K_M.gguf
# 128k context window. The PagedAttention allocator in src/paged_attention.rs
# makes long contexts viable on commodity hardware by paging fixed-size
# physical KV blocks instead of pre-reserving a contiguous tensor per
# sequence. Branching (Tree-of-Thought, speculative decoding) shares the
# system-prompt blocks via Copy-on-Write so this large window stays cheap
# even under fan-out.
PARAMETER num_ctx 128000
# Low-variance decoding. The Cognitive Reliability Layer in
# src/validation.rs is the source of truth for output validity, but
# keeping temperature low here reduces the number of correction-prompt
# round-trips the interceptor has to issue. Note that temperature 0.2
# with top_p 0.9 still samples, so outputs are not strictly
# deterministic.
PARAMETER temperature 0.2
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.05
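# Optional variant, not part of the original template: if repeatable
# output matters more than diversity, Ollama also accepts greedy
# decoding with a fixed seed. The seed value 42 is arbitrary, and
# whether results are bit-for-bit identical across hardware is an
# assumption to verify, not a guarantee.
#   PARAMETER temperature 0
#   PARAMETER seed 42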
# Strict JSON output contract.
#
# Every response from this model is consumed by Nexus-Core's
# SemanticInterceptor (see src/validation.rs). The interceptor will:
#   1. Parse the response with serde_json. Any syntax error triggers the
#      correction prompt "CRITICAL: Invalid JSON format. Expected valid
#      JSON object." and the model is re-invoked.
#   2. Enforce that every required key is present at the top level.
#      A missing key triggers "CRITICAL: Schema validation failed.
#      Missing required key: <name>. Fix the JSON structure."
#   3. Quarantine the request and surface it to Python as a ValueError
#      once the bounded retry budget is exhausted.
#
# Therefore the model MUST treat the rules below as load-bearing — the
# orchestrator will not forward, log, or surface anything that violates
# them.
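#
# Illustration only: assuming a hypothetical task contract whose
# required top-level keys are `answer` and `confidence` (neither name
# comes from src/validation.rs), the checks above would treat
# completions roughly as follows:
#   {"answer": "42", "confidence": 0.9}   -> passes checks 1 and 2
#   {"answer": "42"}                      -> check 2: missing `confidence`
#   Sure! {"answer": "42"}                -> check 1: not valid JSON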
SYSTEM """
You are an agent running inside the Nexus-Core deterministic
orchestrator. You communicate with the orchestrator exclusively via
strict JSON.

Output rules (non-negotiable):
  1. Every response is a single, well-formed JSON object. No prose,
     no Markdown, no code fences, no leading or trailing whitespace
     outside the object.
  2. Every top-level key required by the active task contract MUST be
     present. Do not invent additional keys unless the contract
     permits them.
  3. String values are valid UTF-8. Numbers are JSON numbers (not
     quoted). Booleans are `true` / `false`, never `True` / `False`.
  4. If you cannot satisfy the contract, return a JSON object with the
     single key `"error"` whose value is a short human-readable string.
     Do not narrate the failure outside the JSON object.
  5. Tool invocations are expressed as JSON of the shape
     `{"tool_name": "...", "payload": {...}}`. The Zero-Trust MCP
     gatekeeper (src/mcp.rs) will inspect `tool_name` against the
     active SecurityLevel before any transport executes; tools whose
     names start with `write_`, `delete_`, `drop_`, or `update_`
     require human approval under `require_approval`.

Reasoning happens inside JSON string values, never outside the
object. When in doubt, output less prose and more structure.
"""
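# Illustration of rules 4 and 5 in the SYSTEM prompt. The tool names
# (`read_file`, `delete_file`), the `path` payload field, and the error
# text are hypothetical; none of them is defined by this repository.
#   {"error": "required input missing"}
#     -> rule 4: the failure is reported inside the JSON object
#   {"tool_name": "read_file", "payload": {"path": "notes.txt"}}
#     -> matches none of the listed prefixes, so the prefix rule alone
#        does not call for approval
#   {"tool_name": "delete_file", "payload": {"path": "notes.txt"}}
#     -> matches the `delete_` prefix; held for human approval under
#        `require_approval`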