# N8N Mixture of Agents (MoA) Architecture

The "Google Antigravity" Neural Router

## 1. Core Philosophy

Treat n8n as a **Neural Router**, decoupling "Thinking" (Logic/Architecture) from "Inference" (Execution/Code). Routing each task to the most efficient model avoids cloud latency and model refusals.

## 2. Infrastructure: The "OpenAI-Compatible" Bridge

**Optimization**: Run n8n **natively** on Windows (`npm install -g n8n`) instead of in Docker.

- **Why**: Eliminates the `host.docker.internal` bridge bottleneck.
- **Effect**: n8n talks directly to `localhost:1234` with zero latency overhead.

Standardize all providers on the OpenAI API protocol.

### Local (Code & Privacy)

- **Tool**: Ollama / LM Studio (the "New Friends" cluster)
- **Endpoints**:
  - Ollama: `http://localhost:11434/v1`
  - LM Studio: `http://localhost:1234/v1`

### Local Stack (The "Nano Swarm")

Instead of one giant model, use a stack of specialized lightweight models to save RAM:

- **Router/Logic**: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (high logic-to-parameter ratio).
- **Coding**: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
- **Creative**: `openhermes-2.5-mistral-7b`.

**Configuration**:

- **Endpoint**: `http://localhost:1234/v1`
- **Multi-Model**: If using LM Studio, load the specific model needed for the batch, or run multiple instances on ports `1234`, `1235`, and `1236`.

### Workflow Import

A ready-to-use workflow file has been generated at: `hf_space/logos_n8n_workflow.json`

**Usage**:

1. Open the N8N Editor.
2. Click **Workflow** > **Import from File**.
3. Select `logos_n8n_workflow.json`.
4. Execute. It will scan your codebase using the Local Nano Swarm.

### Connection Health Check

Verify the stack is active with this rhyme test:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-nano",
    "messages": [
      {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
      {"role": "user", "content": "What day is it today?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```

If the server is not yet running, start it in LM Studio:

1. Open LM Studio.
2. Click the **Local Server** icon (`<->`) on the left sidebar.
3. Ensure settings:
   - **Port**: `1234`
   - **CORS**: On (recommended)
4. Click **Start Server**.
5. *Green Light*: The log should say `Server listening on http://localhost:1234`.

### High-Speed Inference (Math & Logic)

- **Tool**: DeepSeek API
- **Endpoint**: `https://api.deepseek.com/v1`
- **Model**: `deepseek-v3` (math verification, topology)

### Synthesis (Architecture)

- **Tool**: Google Vertex / Gemini
- **Role**: Systems Architect (high-level synthesis)

## 3. The N8N Topology: "Router & Jury"

### Phase A: The Dispatcher (Llama-3-8B-Groq)

Classifies the incoming request type:

- **Systems Architecture** -> route to Gemini
- **Python Implementation** -> route to Dolphin (local)
- **Mathematical Proof** -> route to DeepSeek (API)

### Phase B: Parallel Execution

Use a **Merge Node (Wait Mode)** to execute paths simultaneously.

1. **Path 1 (Math)**: DeepSeek analyzes Prime Potentiality/Manifold logic.
2. **Path 2 (Code)**: Dolphin writes adapters/scripts locally.

   *Implementation helper*:

   ```python
   # Use a transformers pipeline as a high-level helper for local execution.
   # Note: GGUF checkpoints need the `gguf_file` argument; for standard
   # safetensors repos the model id alone is enough.
   from transformers import pipeline

   pipe = pipeline("text-generation", model="dphn/Dolphin-X1-8B-GGUF")
   messages = [{"role": "user", "content": "Write the adapter."}]
   print(pipe(messages))
   ```

3. **Path 3 (Sys)**: Gemini drafts Strategy/README.

### Phase C: Consensus (The Annealing)

A final LLM node synthesizes the outputs:

> "Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek's constraints."

## 4. Implementation Config

### HTTP Request Node (Generic)

- **Method**: POST
- **URL**: `{{ $json.baseUrl }}/chat/completions`
- **Headers**: `Authorization: Bearer {{ $json.apiKey }}`
- **Body**:

```json
{
  "model": "{{ $json.modelName }}",
  "messages": [
    { "role": "system", "content": "You are a LOGOS systems engineer." },
    { "role": "user", "content": "{{ $json.prompt }}" }
  ],
  "temperature": 0.2
}
```

### Model Selector (Code Node)

```javascript
// Route by task type. Since n8n runs natively (see section 2), local
// models are reached via localhost rather than host.docker.internal.
// The node must return an array of items.
if (items[0].json.taskType === "coding") {
  return [{ json: { baseUrl: "http://localhost:11434/v1", modelName: "dolphin-llama3", apiKey: "ollama" } }];
} else if (items[0].json.taskType === "math") {
  return [{ json: { baseUrl: "https://api.deepseek.com/v1", modelName: "deepseek-coder", apiKey: "YOUR_DEEPSEEK_KEY" } }];
}
// Default: fall back to the local coding model.
return [{ json: { baseUrl: "http://localhost:11434/v1", modelName: "dolphin-llama3", apiKey: "ollama" } }];
```

This architecture breaks the bottleneck by using Dolphin for the grunt work (local and free) and specialized models for high-IQ tasks.
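For testing the router outside n8n, the same dispatch logic can be sketched as a small Python script. This is a minimal sketch, not the workflow itself: the route table mirrors the Model Selector node above, `YOUR_DEEPSEEK_KEY` remains a placeholder, and only the local Ollama and DeepSeek routes from section 2 are assumed.

```python
import json
import urllib.request

# Provider table mirroring the Model Selector Code Node. Unknown task
# types fall back to the local coding model, matching the node's default.
ROUTES = {
    "coding": {"baseUrl": "http://localhost:11434/v1",
               "modelName": "dolphin-llama3", "apiKey": "ollama"},
    "math":   {"baseUrl": "https://api.deepseek.com/v1",
               "modelName": "deepseek-coder", "apiKey": "YOUR_DEEPSEEK_KEY"},
}


def route(task_type):
    """Return the provider config for a task type (default: local coding)."""
    return ROUTES.get(task_type, ROUTES["coding"])


def dispatch(task_type, prompt):
    """POST a chat completion to the routed OpenAI-compatible endpoint."""
    cfg = route(task_type)
    body = json.dumps({
        "model": cfg["modelName"],
        "messages": [
            {"role": "system", "content": "You are a LOGOS systems engineer."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,
    }).encode("utf-8")
    req = urllib.request.Request(
        cfg["baseUrl"] + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + cfg["apiKey"],
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Usage: `dispatch("coding", "Write the adapter.")` hits the local Ollama endpoint, while `dispatch("math", ...)` goes to the DeepSeek API, exactly as the n8n selector routes them.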