# N8N Mixture of Agents (MoA) Architecture

*The "Google Antigravity" Neural Router*
## 1. Core Philosophy

Treat n8n as a **Neural Router** that decouples "Thinking" (Logic/Architecture) from "Inference" (Execution/Code). Routing each task to the most efficient model sidesteps both latency bottlenecks and model refusals.
## 2. Infrastructure: The "OpenAI-Compatible" Bridge

**Optimization**: Run n8n **natively** on Windows (`npm install -g n8n`) instead of in Docker.
- **Why**: Eliminates the `host.docker.internal` bridge bottleneck.
- **Effect**: n8n talks to `localhost:1234` directly, with minimal latency overhead.

Standardize all providers on the OpenAI API protocol.
### Local (Code & Privacy)

- **Tool**: Ollama / LM Studio (the "New Friends" cluster)
- **Endpoints**:
  - Ollama: `http://localhost:11434/v1`
  - LM Studio: `http://localhost:1234/v1`
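Because both local servers speak the OpenAI protocol, a single helper can target either of them. A minimal sketch (the helper name is illustrative; `GET /v1/models` is the standard OpenAI-compatible model-listing route, which both Ollama and LM Studio expose):

```javascript
// Build a health-check request for any OpenAI-compatible endpoint.
// Local servers accept a placeholder API key.
function buildModelsCheck(baseUrl, apiKey = "not-needed") {
  return {
    url: `${baseUrl}/models`,
    options: {
      method: "GET",
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// Pass check.url and check.options to fetch() to list the loaded models.
const check = buildModelsCheck("http://localhost:1234/v1");
```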
### Local Stack (The "Nano Swarm")

Instead of one giant model, run a stack of specialized lightweight models to save RAM:
- **Router/Logic**: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (high logic-to-parameter ratio).
- **Coding**: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
- **Creative**: `openhermes-2.5-mistral-7b`.

**Configuration**:
- **Endpoint**: `http://localhost:1234/v1`
- **Multi-Model**: With LM Studio, load the specific model needed for each batch, or run multiple instances on ports `1234`, `1235`, and `1236`.
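The multi-instance setup can be captured in a small role-to-port map. A sketch, assuming the three roles run on ports 1234-1236 as suggested above (the map name and port assignments are illustrative):

```javascript
// Hypothetical port map for three LM Studio instances running side by side.
const SWARM = {
  router:   { port: 1234, model: "nvidia/nemotron-3-nano" },
  coding:   { port: 1235, model: "deepseek-coder-6.7b" },
  creative: { port: 1236, model: "openhermes-2.5-mistral-7b" },
};

// Resolve the OpenAI-compatible base URL for a given role.
function endpointFor(role) {
  const entry = SWARM[role];
  if (!entry) throw new Error(`Unknown role: ${role}`);
  return `http://localhost:${entry.port}/v1`;
}
```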
### Workflow Import

A ready-to-use workflow file has been generated at:
`hf_space/logos_n8n_workflow.json`

**Usage**:
1. Open the n8n Editor.
2. Click **Workflow** > **Import from File**.
3. Select `logos_n8n_workflow.json`.
4. Execute. The workflow scans your codebase using the local Nano Swarm.
### Connection Health Check

Verify the stack is active with this rhyme test:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-nano",
    "messages": [
      {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
      {"role": "user", "content": "What day is it today?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```
### LM Studio Server Setup

1. Open LM Studio and load the model you want to serve.
2. Click the **Local Server** icon (`<->`) in the left sidebar.
3. Check the settings:
   - **Port**: `1234`
   - **CORS**: On (recommended)
4. Click **Start Server**.
5. *Green light*: the log should read `Server listening on http://localhost:1234`.
### High-Speed Inference (Math & Logic)

- **Tool**: DeepSeek API (Groq handles the dispatcher; see Phase A below)
- **Endpoint**: `https://api.deepseek.com/v1`
- **Model**: `deepseek-v3` (math verification, topology)
### Synthesis (Architecture)

- **Tool**: Google Vertex / Gemini
- **Role**: Systems Architect (high-level synthesis)
## 3. The N8N Topology: "Router & Jury"

### Phase A: The Dispatcher (Llama-3-8B via Groq)

Classifies the incoming request type:
- **Systems Architecture** -> route to Gemini
- **Python Implementation** -> route to Dolphin (local)
- **Mathematical Proof** -> route to DeepSeek (API)
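If the Groq classifier is unavailable, the same three-way routing can be approximated deterministically. A sketch of a keyword fallback (the keywords and labels are illustrative and not part of the workflow file):

```javascript
// Deterministic fallback for the Dispatcher: route by keyword when the
// LLM classifier is down. Order matters: math patterns are checked first.
function classifyTask(prompt) {
  const p = prompt.toLowerCase();
  if (/\b(proof|theorem|topology|manifold)\b/.test(p)) return "math";
  if (/\b(python|code|script|adapter)\b/.test(p)) return "coding";
  return "architecture"; // default: route to Gemini
}
```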
### Phase B: Parallel Execution

Use a **Merge Node (Wait Mode)** so the paths execute simultaneously.

1. **Path 1 (Math)**: DeepSeek analyzes Prime Potentiality/Manifold logic.
2. **Path 2 (Code)**: Dolphin writes adapters/scripts locally.
   - *Implementation helper* (note: GGUF checkpoints target llama.cpp/LM Studio; `transformers` needs the standard safetensors repo):

   ```python
   # Use a pipeline as a high-level helper for local execution.
   from transformers import pipeline

   pipe = pipeline("text-generation", model="cognitivecomputations/dolphin-2.9-llama3-8b")
   messages = [{"role": "user", "content": "Write the adapter."}]
   result = pipe(messages)
   ```
3. **Path 3 (Sys)**: Gemini drafts the Strategy/README.
### Phase C: Consensus (The Annealing)

A final LLM node synthesizes the outputs:

> "Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek's constraints."
## 4. Implementation Config

### HTTP Request Node (Generic)

- **Method**: POST
- **URL**: `{{ $json.baseUrl }}/chat/completions`
- **Headers**: `Authorization: Bearer {{ $json.apiKey }}`
- **Body**:

```json
{
  "model": "{{ $json.modelName }}",
  "messages": [
    { "role": "system", "content": "You are a LOGOS systems engineer." },
    { "role": "user", "content": "{{ $json.prompt }}" }
  ],
  "temperature": 0.2
}
```
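One caveat with the templated body: if `{{ $json.prompt }}` contains quotes or newlines, the interpolated JSON becomes invalid. A safer pattern is to build the body in a Code node with `JSON.stringify` (a sketch; the function name is illustrative):

```javascript
// Build the chat-completions body programmatically so special characters
// in the prompt are escaped correctly.
function buildChatBody(modelName, prompt) {
  return JSON.stringify({
    model: modelName,
    messages: [
      { role: "system", content: "You are a LOGOS systems engineer." },
      { role: "user", content: prompt },
    ],
    temperature: 0.2,
  });
}
```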
### Model Selector (Code Node)

```javascript
// Route by task type. With n8n running natively (Section 2), local endpoints
// are plain localhost rather than host.docker.internal.
if (items[0].json.taskType === "coding") {
  return [{ json: {
    baseUrl: "http://localhost:11434/v1",
    modelName: "dolphin-llama3",
    apiKey: "ollama" // Ollama accepts any placeholder key
  }}];
} else if (items[0].json.taskType === "math") {
  return [{ json: {
    baseUrl: "https://api.deepseek.com/v1",
    modelName: "deepseek-chat", // serves DeepSeek-V3 on the official API
    apiKey: "YOUR_DEEPSEEK_KEY"
  }}];
}
// Fail loudly instead of returning undefined for unrouted task types.
throw new Error(`No route for taskType: ${items[0].json.taskType}`);
```
This architecture breaks the bottleneck by routing grunt work to Dolphin (local and free) and reserving the specialized models for high-IQ tasks.