# Nexus-Core inference template.
#
# This Modelfile is the canonical configuration for any local
# inference engine (Ollama, llama.cpp server wrappers, custom
# loaders) that needs to talk to a Nexus-Core orchestrated agent.
# The PARAMETER and SYSTEM directives below align the underlying
# model with the contracts our Rust core enforces, most
# importantly the SemanticInterceptor in src/validation.rs, which
# rejects any non-JSON or schema-violating completion before it
# reaches Python.

FROM gemma-4-8b-it-Q4_K_M.gguf

# 128k context window. The PagedAttention allocator in src/paged_attention.rs
# makes long contexts viable on commodity hardware by paging fixed-size
# physical KV blocks instead of pre-reserving a contiguous tensor per
# sequence. Branching (Tree-of-Thought, speculative decoding) shares the
# system-prompt blocks via Copy-on-Write, so this large window stays cheap
# even under fan-out.
PARAMETER num_ctx 128000

# Low-temperature, near-deterministic decoding. The Cognitive Reliability
# Layer in src/validation.rs is the source of truth for output validity, but
# keeping temperature low here reduces the number of correction-prompt
# round-trips the interceptor has to issue.
PARAMETER temperature 0.2
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.05

# Strict JSON output contract.
#
# Every response from this model is consumed by Nexus-Core's
# SemanticInterceptor (see src/validation.rs). The interceptor will:
#   1. Parse the response with serde_json. Any syntax error triggers the
#      correction prompt: "CRITICAL: Invalid JSON format. Expected valid
#      JSON object." and the model is re-invoked.
#   2. Enforce that every required key is present at the top level.
#      Missing keys trigger: "CRITICAL: Schema validation failed.
#      Missing required key: . Fix the JSON structure."
#   3. Quarantine the request as a Python ValueError after the bounded
#      retry budget is exhausted.
#
# Therefore the model MUST treat the rules below as load-bearing: the
# orchestrator will not forward, log, or surface anything that violates
# them.
SYSTEM """
You are an agent running inside the Nexus-Core deterministic orchestrator.
You communicate with the orchestrator exclusively via strict JSON.

Output rules (non-negotiable):

1. Every response is a single, well-formed JSON object. No prose, no
   Markdown, no code fences, no leading or trailing whitespace outside
   the object.
2. Every top-level key required by the active task contract MUST be
   present. Do not invent additional keys unless the contract permits
   them.
3. String values are valid UTF-8. Numbers are JSON numbers (not quoted).
   Booleans are `true` / `false`, never `True` / `False`.
4. If you cannot satisfy the contract, return a JSON object with the
   single key `"error"` whose value is a short human-readable string.
   Do not narrate the failure outside the JSON object.
5. Tool invocations are expressed as JSON of the shape
   `{"tool_name": "...", "payload": {...}}`. The Zero-Trust MCP
   gatekeeper (src/mcp.rs) will inspect `tool_name` against the active
   SecurityLevel before any transport executes; tools whose names start
   with `write_`, `delete_`, `drop_`, or `update_` require human
   approval under `require_approval`.

Reasoning happens inside JSON string values, never outside the object.
When in doubt, output less prose and more structure.
"""
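
# Illustrative completions that would satisfy output rules 4 and 5 above.
# This is a sketch only: the tool name `read_file`, the payload key `path`,
# and the error text are hypothetical and are not taken from any Nexus-Core
# task contract.
#
#   {"tool_name": "read_file", "payload": {"path": "notes.txt"}}
#   {"error": "cannot satisfy the active task contract"}
#
# A name such as `write_file` would instead match the `write_` prefix and,
# per rule 5, wait for human approval under `require_approval`.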