# Nexus-Core inference template.
#
# This Modelfile is the canonical configuration for any local
# inference engine (Ollama, llama.cpp server wrappers, custom
# loaders) that needs to talk to a Nexus-Core orchestrated agent.
# The PARAMETER and SYSTEM directives below align the underlying
# model with the contracts our Rust core enforces. The most important
# of these is the SemanticInterceptor in src/validation.rs, which
# rejects any non-JSON or schema-violating completion before it
# reaches Python.

FROM gemma-4-8b-it-Q4_K_M.gguf

# 128k-token context window. The PagedAttention allocator in
# src/paged_attention.rs makes long contexts viable on commodity hardware
# by paging fixed-size physical KV blocks instead of pre-reserving a
# contiguous tensor per sequence. Branching (Tree-of-Thought, speculative
# decoding) shares the system-prompt blocks via Copy-on-Write, so this
# large window stays cheap even under fan-out.
PARAMETER num_ctx 128000

# Low-temperature decoding. The Cognitive Reliability Layer in
# src/validation.rs is the source of truth for output validity, but
# keeping temperature low here reduces the number of correction-prompt
# round-trips the interceptor has to issue. Sampling at 0.2 is still
# stochastic, so identical prompts can produce different completions.
PARAMETER temperature 0.2
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.05
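
# Sketch, not part of the shipped configuration: temperature 0.2 with
# top_p 0.9 still samples, so runs are not guaranteed to repeat. If
# repeatable output matters more than diversity, one option is greedy
# decoding with a pinned seed. The seed value 42 is an arbitrary
# placeholder; to try it, replace the temperature line above and add:
#
#   PARAMETER temperature 0
#   PARAMETER seed 42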

# Strict JSON output contract.
#
# Every response from this model is consumed by Nexus-Core's
# SemanticInterceptor (see src/validation.rs). The interceptor will:
#   1. Parse the response with serde_json. Any syntax error triggers the
#      correction prompt: "CRITICAL: Invalid JSON format. Expected valid
#      JSON object." and the model is re-invoked.
#   2. Enforce that every required key is present at the top level.
#      Missing keys trigger: "CRITICAL: Schema validation failed.
#      Missing required key: <name>. Fix the JSON structure."
#   3. Quarantine the request once the bounded retry budget is
#      exhausted and surface the failure to Python as a ValueError.
#
# Therefore the model MUST treat the rules below as load-bearing: the
# orchestrator will not forward, log, or surface anything that violates
# them.
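#
# Illustration of steps 1 and 2 above. The tool name, payload fields,
# and contract are hypothetical; real task contracts define their own
# required keys.
#
# Parses as a single JSON object, so it should clear step 1 (whether it
# clears step 2 depends on the keys the active contract requires):
#   {"tool_name": "read_file", "payload": {"path": "notes.txt"}}
#
# Should fail step 1, because neither line is valid JSON on its own:
#   Sure! {"tool_name": "read_file", "payload": {}}
#   {"tool_name": "read_file", "payload": {"dry_run": True}}
# The first has leading prose; the second uses a Python-style boolean.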
SYSTEM """
You are an agent running inside the Nexus-Core deterministic
orchestrator. You communicate with the orchestrator exclusively via
strict JSON.

Output rules — non-negotiable:

1. Every response is a single, well-formed JSON object. No prose,
   no Markdown, no code fences, no leading or trailing whitespace
   outside the object.
2. Every top-level key required by the active task contract MUST be
   present. Do not invent additional keys unless the contract
   permits them.
3. String values are valid UTF-8. Numbers are JSON numbers (not
   quoted). Booleans are `true` / `false`, never `True` / `False`.
4. If you cannot satisfy the contract, return a JSON object with the
   single key `"error"` whose value is a short human-readable string.
   Do not narrate the failure outside the JSON object.
5. Tool invocations are expressed as JSON of the shape
   `{"tool_name": "...", "payload": {...}}`. The Zero-Trust MCP
   gatekeeper (src/mcp.rs) will inspect `tool_name` against the
   active SecurityLevel before any transport executes; tools whose
   names start with `write_`, `delete_`, `drop_`, or `update_`
   require human approval under `require_approval`.

Reasoning happens inside JSON string values, never outside the
object. When in doubt, output less prose and more structure.
"""
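
# Usage sketch, assuming a stock Ollama install and that this file is
# saved as ./Modelfile in a location from which the GGUF named in FROM
# resolves. The model name "nexus-core" is a placeholder chosen for
# this example, not something the project mandates:
#
#   ollama create nexus-core -f ./Modelfile
#   ollama run nexus-core
#
# For engines that do not parse Modelfiles, the same values would need
# to be passed through that engine's own options.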