# N8N Mixture of Agents (MoA) Architecture
The "Google Antigravity" Neural Router

## 1. Core Philosophy
Treat n8n as a **Neural Router** that decouples "Thinking" (logic/architecture) from "Inference" (execution/code). Routing each task to the most efficient model for it sidesteps both API latency and model refusals.

## 2. Infrastructure: The "OpenAI-Compatible" Bridge
**Optimization**: Run n8n **NATIVELY** on Windows (`npm install -g n8n`) instead of Docker.
- **Why**: Eliminates the `host.docker.internal` bridge bottleneck.
- **Effect**: n8n talks directly to `localhost:1234` with no Docker network-bridge overhead.

Standardize all providers to the OpenAI API protocol.

### Local (Code & Privacy)
- **Tool**: Ollama / LM Studio (The "New Friends" Cluster)
- **Endpoint**: 
    - Ollama: `http://localhost:11434/v1`
    - LM Studio: `http://localhost:1234/v1`
### Local Stack (The "Nano Swarm")
Instead of one giant model, use a stack of specialized lightweight models to save RAM:
-   **Router/Logic**: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (High logic/param ratio).
-   **Coding**: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
-   **Creative**: `openhermes-2.5-mistral-7b`.

**Configuration**:
-   **Endpoint**: `http://localhost:1234/v1`
-   **Multi-Model**: If using LM Studio, load the specific model needed for the batch, or run multiple instances on ports `1234`, `1235`, `1236`.
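The multi-instance setup above can be captured as a small port map that the router consults per task (ports taken from the list above; the task-type keys are assumptions):

```python
# Port map for the multi-instance LM Studio setup (ports from the list above)
ENDPOINTS = {
    "router": "http://localhost:1234/v1",
    "coding": "http://localhost:1235/v1",
    "creative": "http://localhost:1236/v1",
}

def endpoint_for(task_type: str) -> str:
    """Base URL for a task type, falling back to the router instance."""
    return ENDPOINTS.get(task_type, ENDPOINTS["router"])
```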

### Workflow Import
A ready-to-use workflow file has been generated at:
`hf_space/logos_n8n_workflow.json`

**Usage**:
1.  Open N8N Editor.
2.  Click **Workflow** > **Import from File**.
3.  Select `logos_n8n_workflow.json`.
4.  Execute. It will scan your codebase using the Local Nano Swarm.

### Connection Health Check
Verify the stack is active with this rhyme test:
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-nano",
    "messages": [
        {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
        {"role": "user", "content": "What day is it today?"}
    ],
    "temperature": 0.7,
    "stream": false
}'
```
### LM Studio Server Setup
1.  Open LM Studio and load the model you want to serve.
2.  Click the **Local Server** icon (`<->`) on the left sidebar.
3.  Ensure settings:
    -   **Port**: `1234`
    -   **CORS**: On (Recommended)
4.  Click **Start Server**.
5.  *Green Light*: The log should say `Server listening on http://localhost:1234`.

### High-Speed Inference (Math & Logic)
- **Tool**: DeepSeek API (or DeepInfra/Groq-hosted alternatives)
- **Endpoint**: `https://api.deepseek.com/v1`
- **Model**: `deepseek-chat` (the API name for DeepSeek-V3; math verification, topology)

### Synthesis (Architecture)
- **Tool**: Google Vertex / Gemini
- **Role**: Systems Architect (High-level synthesis)

## 3. The N8N Topology: "Router & Jury"

### Phase A: The Dispatcher (Llama-3-8B-Groq)
Classifies incoming request type:
- **Systems Architecture** -> Route to Gemini
- **Python Implementation** -> Route to Dolphin (Local)
- **Mathematical Proof** -> Route to DeepSeek (API)
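When the Groq dispatcher is unavailable, the routing decision can be approximated locally with a keyword heuristic; the keywords below are illustrative assumptions, not the actual dispatcher prompt:

```python
# Fallback keyword dispatcher mirroring the three routes above (keywords are assumptions)
def classify(prompt: str) -> str:
    p = prompt.lower()
    if any(k in p for k in ("proof", "theorem", "verify", "manifold")):
        return "math"          # -> DeepSeek (API)
    if any(k in p for k in ("python", "script", "adapter", "implement")):
        return "coding"        # -> Dolphin (Local)
    return "architecture"      # -> Gemini
```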

### Phase B: Parallel Execution
Use **Merge Node (Wait Mode)** to execute paths simultaneously.
1.  **Path 1 (Math)**: DeepSeek analyzes Prime Potentiality/Manifold logic.
2.  **Path 2 (Code)**: Dolphin writes adapters/scripts locally.
    -   *Implementation Helper* (a sketch against the local Ollama endpoint the doc standardizes on, replacing the non-loadable GGUF `transformers` call; assumes `dolphin-llama3` is already pulled):
        ```python
        # POST to the local OpenAI-compatible chat endpoint
        import requests

        resp = requests.post(
            "http://localhost:11434/v1/chat/completions",
            json={
                "model": "dolphin-llama3",
                "messages": [{"role": "user", "content": "Write the adapter."}],
            },
            timeout=120,
        )
        print(resp.json()["choices"][0]["message"]["content"])
        ```
3.  **Path 3 (Sys)**: Gemini drafts Strategy/README.
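Outside n8n, the same fan-out/wait pattern the Merge Node performs can be sketched with `asyncio`; the backend names are placeholders for the three paths, and `call_backend` stands in for a real `/chat/completions` POST:

```python
# Sketch of the Merge Node (Wait Mode) fan-out: all paths run concurrently,
# and results are collected only once every path has finished.
import asyncio

async def call_backend(name: str, delay: float) -> str:
    # Stand-in for an HTTP call to one backend
    await asyncio.sleep(delay)
    return f"{name}: done"

async def fan_out() -> list:
    # gather() preserves argument order, like the Merge Node's input slots
    return await asyncio.gather(
        call_backend("deepseek-math", 0.01),
        call_backend("dolphin-code", 0.01),
        call_backend("gemini-sys", 0.01),
    )

results = asyncio.run(fan_out())
```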

### Phase C: Consensus (The Annealing)
Final LLM Node synthesizes outputs:
> "Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek constraints."

## 4. Implementation Config

### HTTP Request Node (Generic)
- **Method**: POST
- **URL**: `{{ $json.baseUrl }}/chat/completions`
- **Headers**: `Authorization: Bearer {{ $json.apiKey }}`
- **Body**:
```json
{
  "model": "{{ $json.modelName }}",
  "messages": [
    { "role": "system", "content": "You are a LOGOS systems engineer." },
    { "role": "user", "content": "{{ $json.prompt }}" }
  ],
  "temperature": 0.2
}
```

### Model Selector (Code Node)
```javascript
// n8n Code node ("Run Once for All Items"): route each task to the
// cheapest capable backend. A Code node must return an array of items.
const task = items[0].json.taskType;

if (task === "coding") {
  // Local Ollama; native install, so plain localhost works (no host.docker.internal)
  return [{ json: {
    baseUrl: "http://localhost:11434/v1",
    modelName: "dolphin-llama3",
    apiKey: "ollama"
  }}];
}

if (task === "math") {
  return [{ json: {
    baseUrl: "https://api.deepseek.com/v1",
    modelName: "deepseek-chat", // the API name for DeepSeek-V3
    apiKey: "YOUR_DEEPSEEK_KEY"
  }}];
}

// Default: local router model, so unmatched tasks never return undefined
return [{ json: {
  baseUrl: "http://localhost:1234/v1",
  modelName: "nvidia/nemotron-3-nano",
  apiKey: "lm-studio"
}}];
```

This architecture breaks the cost/latency bottleneck by routing grunt work to Dolphin (local, free) and reserving the specialized hosted models for high-IQ tasks.