# N8N Mixture of Agents (MoA) Architecture
The "Google Antigravity" Neural Router
## 1. Core Philosophy
Treat n8n as a **Neural Router** that decouples "Thinking" (logic/architecture) from "Inference" (execution/code). Routing each task to the most efficient model reduces latency and sidesteps refusals.
## 2. Infrastructure: The "OpenAI-Compatible" Bridge
**Optimization**: Run n8n **NATIVELY** on Windows (`npm install -g n8n`) instead of Docker.
- **Why**: Eliminates the `host.docker.internal` bridge bottleneck.
- **Effect**: N8N talks directly to `localhost:1234` with zero latency overhead.
Standardize all providers to the OpenAI API protocol.
### Local (Code & Privacy)
- **Tool**: Ollama / LM Studio (The "New Friends" Cluster)
- **Endpoint**:
- Ollama: `http://localhost:11434/v1`
- LM Studio: `http://localhost:1234/v1`
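Because both servers speak the OpenAI wire protocol, one client covers either backend. A minimal sketch, assuming the `openai` Python package is installed (local servers accept any placeholder API key):

```python
# Map each local provider to its OpenAI-compatible base URL.
# Ports match the defaults listed above.
LOCAL_ENDPOINTS = {
    "ollama": "http://localhost:11434/v1",
    "lmstudio": "http://localhost:1234/v1",
}

def make_client(provider: str):
    """Return an OpenAI-compatible client for a local provider.

    Assumes the `openai` SDK; local servers ignore the API key,
    but the client requires a non-empty string.
    """
    from openai import OpenAI
    return OpenAI(base_url=LOCAL_ENDPOINTS[provider], api_key="not-needed")
```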
### Local Stack (The "Nano Swarm")
Instead of one giant model, use a stack of specialized lightweight models to save RAM:
- **Router/Logic**: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (High logic/param ratio).
- **Coding**: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
- **Creative**: `openhermes-2.5-mistral-7b`.
**Configuration**:
- **Endpoint**: `http://localhost:1234/v1`
- **Multi-Model**: If using LM Studio, load the specific model needed for the batch, or run multiple instances on ports `1234`, `1235`, `1236`.
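The per-role assignment above can be pinned down in a single table so the router always targets the right instance. A sketch, where the role-to-port layout and model IDs are illustrative assumptions following the multi-instance suggestion:

```python
# One LM Studio instance per role, on consecutive ports as suggested above.
SWARM = {
    "router":   {"port": 1234, "model": "nvidia/nemotron-3-nano"},
    "coding":   {"port": 1235, "model": "deepseek-coder-6.7b"},
    "creative": {"port": 1236, "model": "openhermes-2.5-mistral-7b"},
}

def endpoint_for(role: str) -> str:
    """Build the chat-completions URL for a given swarm role."""
    cfg = SWARM[role]
    return f"http://localhost:{cfg['port']}/v1/chat/completions"
```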
### Workflow Import
A ready-to-use workflow file has been generated at:
`hf_space/logos_n8n_workflow.json`
**Usage**:
1. Open N8N Editor.
2. Click **Workflow** > **Import from File**.
3. Select `logos_n8n_workflow.json`.
4. Execute. It will scan your codebase using the Local Nano Swarm.
### Connection Health Check
Verify the stack is active with this rhyme test:
```bash
curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "nvidia/nemotron-3-nano",
"messages": [
{"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
{"role": "user", "content": "What day is it today?"}
],
"temperature": 0.7,
"stream": false
}'
```
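The same probe can be issued from Python using only the standard library. A sketch mirroring the curl payload above (the server must already be running for `probe()` to succeed):

```python
import json
import urllib.request

def build_rhyme_probe(model: str = "nvidia/nemotron-3-nano") -> dict:
    """Payload identical to the curl health check above."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
            {"role": "user", "content": "What day is it today?"},
        ],
        "temperature": 0.7,
        "stream": False,
    }

def probe(url: str = "http://localhost:1234/v1/chat/completions") -> str:
    """POST the probe and return the first completion's text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_rhyme_probe()).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```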
### Starting the LM Studio Server
1. Open LM Studio.
2. Click the **Local Server** icon (`<->`) on the left sidebar.
3. Ensure settings:
   - **Port**: `1234`
   - **CORS**: On (recommended)
4. Click **Start Server**.
5. *Green light*: the log should read `Server listening on http://localhost:1234`.
### High-Speed Inference (Math & Logic)
- **Tool**: DeepSeek API (DeepInfra / Groq are OpenAI-compatible alternatives)
- **Endpoint**: `https://api.deepseek.com/v1`
- **Model**: `deepseek-v3` (math verification, topology)
### Synthesis (Architecture)
- **Tool**: Google Vertex / Gemini
- **Role**: Systems Architect (High-level synthesis)
## 3. The N8N Topology: "Router & Jury"
### Phase A: The Dispatcher (Llama-3-8B via Groq)
Classifies incoming request type:
- **Systems Architecture** -> Route to Gemini
- **Python Implementation** -> Route to Dolphin (Local)
- **Mathematical Proof** -> Route to DeepSeek (API)
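The dispatcher's decision can be prototyped with simple keyword rules before handing classification to the LLM. A sketch, where the keywords and route names are assumptions standing in for the real classifier:

```python
# Keyword heuristics standing in for the LLM classifier.
ROUTES = {
    "architecture": "gemini",      # Systems Architecture
    "python": "dolphin-local",     # Python Implementation
    "proof": "deepseek-api",       # Mathematical Proof
}

def dispatch(request: str) -> str:
    """Return the route for a request; default to the local coder."""
    text = request.lower()
    for keyword, route in ROUTES.items():
        if keyword in text:
            return route
    return "dolphin-local"
```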
### Phase B: Parallel Execution
Use **Merge Node (Wait Mode)** to execute paths simultaneously.
1. **Path 1 (Math)**: DeepSeek analyzes Prime Potentiality/Manifold logic.
2. **Path 2 (Code)**: Dolphin writes adapters/scripts locally.
- *Implementation Helper*:
```python
# Use a transformers pipeline as a high-level helper for local execution.
# Note: GGUF checkpoints may need a quantized-model loader or a llama.cpp
# backend; a standard safetensors checkpoint loads directly.
from transformers import pipeline

pipe = pipeline("text-generation", model="dphn/Dolphin-X1-8B-GGUF")
messages = [{"role": "user", "content": "Write the adapter."}]
print(pipe(messages))
```
3. **Path 3 (Sys)**: Gemini drafts Strategy/README.
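The Merge Node's wait-for-all behavior corresponds to fanning out the three paths concurrently and joining when the slowest finishes. A sketch with `asyncio`, where the worker bodies are placeholder stubs for the real API calls:

```python
import asyncio

async def math_path(task: str) -> str:   # DeepSeek stub
    return f"math:{task}"

async def code_path(task: str) -> str:   # Dolphin stub
    return f"code:{task}"

async def sys_path(task: str) -> str:    # Gemini stub
    return f"sys:{task}"

async def run_parallel(task: str) -> dict:
    """Run all three paths concurrently and join on completion,
    like a Merge Node in Wait mode."""
    math, code, sys_ = await asyncio.gather(
        math_path(task), code_path(task), sys_path(task)
    )
    return {"math": math, "code": code, "sys": sys_}
```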
### Phase C: Consensus (The Annealing)
Final LLM Node synthesizes outputs:
> "Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek constraints."
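Feeding the three branch outputs into the final node is plain string assembly. A sketch of the prompt builder, where the section labels are assumptions and the instruction text mirrors the quote above:

```python
def consensus_prompt(math: str, code: str, sys_: str) -> str:
    """Assemble the synthesis prompt for the final LLM node."""
    return (
        "Synthesize perspectives. If Dolphin's code conflicts with "
        "DeepSeek's math, prioritize DeepSeek constraints.\n\n"
        f"[Math / DeepSeek]\n{math}\n\n"
        f"[Code / Dolphin]\n{code}\n\n"
        f"[Architecture / Gemini]\n{sys_}"
    )
```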
## 4. Implementation Config
### HTTP Request Node (Generic)
- **Method**: POST
- **URL**: `{{ $json.baseUrl }}/chat/completions`
- **Headers**: `Authorization: Bearer {{ $json.apiKey }}`
- **Body**:
```json
{
"model": "{{ $json.modelName }}",
"messages": [
{ "role": "system", "content": "You are a LOGOS systems engineer." },
{ "role": "user", "content": "{{ $json.prompt }}" }
],
"temperature": 0.2
}
```
### Model Selector (Code Node)
```javascript
// n8n Code node: pick a provider config based on task type.
// A Code node must return an array of items, not a bare object.
const taskType = items[0].json.taskType;

if (taskType === "coding") {
  return [{ json: {
    baseUrl: "http://localhost:11434/v1", // native install: no docker bridge needed
    modelName: "dolphin-llama3",
    apiKey: "ollama"
  }}];
}

if (taskType === "math") {
  return [{ json: {
    baseUrl: "https://api.deepseek.com/v1",
    modelName: "deepseek-coder",
    apiKey: "YOUR_DEEPSEEK_KEY"
  }}];
}

// Fallback: default to the local coder so the workflow never dead-ends.
return [{ json: {
  baseUrl: "http://localhost:11434/v1",
  modelName: "dolphin-llama3",
  apiKey: "ollama"
}}];
```
This architecture breaks the bottleneck by using Dolphin for grunt work (local/free) and specialized models for high-IQ tasks.