N8N Mixture of Agents (MoA) Architecture
The "Google Antigravity" Neural Router
1. Core Philosophy
Treat n8n as a Neural Router, decoupling "Thinking" (Logic/Architecture) from "Inference" (Execution/Code). This sidesteps latency bottlenecks and model refusals by routing each task to the model best equipped to handle it.
2. Infrastructure: The "OpenAI-Compatible" Bridge
Optimization: Run n8n NATIVELY on Windows (`npm install -g n8n`) instead of Docker.
- Why: Eliminates the `host.docker.internal` bridge bottleneck.
- Effect: n8n talks directly to `localhost:1234` with zero latency overhead.
Standardize all providers to the OpenAI API protocol.
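Because every provider speaks the same protocol, one request builder covers all of them; only the base URL, model name, and key change. A minimal sketch (the `buildChatRequest` helper and the `lm-studio` placeholder key are illustrative, not part of n8n):

```javascript
// Build an OpenAI-compatible /chat/completions request for any provider.
// Works identically for Ollama, LM Studio, DeepSeek, etc.; only the
// baseUrl, modelName, and apiKey differ per provider.
function buildChatRequest({ baseUrl, modelName, apiKey, prompt }) {
  return {
    url: `${baseUrl}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: modelName,
      messages: [{ role: "user", content: prompt }],
      temperature: 0.2,
    }),
  };
}

// Same builder, two different providers:
const local = buildChatRequest({
  baseUrl: "http://localhost:1234/v1",
  modelName: "nvidia/nemotron-3-nano",
  apiKey: "lm-studio",          // placeholder; local servers ignore the key
  prompt: "Classify this task.",
});
const remote = buildChatRequest({
  baseUrl: "https://api.deepseek.com/v1",
  modelName: "deepseek-v3",
  apiKey: "YOUR_DEEPSEEK_KEY",
  prompt: "Verify this proof.",
});
console.log(local.url);  // http://localhost:1234/v1/chat/completions
```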
Local (Code & Privacy)
- Tool: Ollama / LM Studio (The "New Friends" Cluster)
- Endpoints:
  - Ollama: `http://localhost:11434/v1`
  - LM Studio: `http://localhost:1234/v1`
Local Stack (The "Nano Swarm")
Instead of one giant model, use a stack of specialized lightweight models to save RAM:
- Router/Logic: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (high logic-to-parameter ratio).
- Coding: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
- Creative: `openhermes-2.5-mistral-7b`.
Configuration:
- Endpoint: `http://localhost:1234/v1`
- Multi-Model: If using LM Studio, load the specific model needed for the batch, or run multiple instances on ports `1234`, `1235`, and `1236`.
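When several instances run side by side, the port effectively selects the model. A hypothetical role-to-port map under that assumption (the port assignments and the `endpointFor` helper are illustrative):

```javascript
// Map each Nano Swarm role to its own LM Studio instance.
// Assumes three instances are running on ports 1234-1236.
const SWARM = {
  logic:    { port: 1234, model: "nvidia/nemotron-3-nano" },
  coding:   { port: 1235, model: "deepseek-coder-6.7b" },
  creative: { port: 1236, model: "openhermes-2.5-mistral-7b" },
};

// Resolve the endpoint for a role, falling back to the logic router.
function endpointFor(role) {
  const entry = SWARM[role] ?? SWARM.logic;
  return { url: `http://localhost:${entry.port}/v1`, model: entry.model };
}

console.log(endpointFor("coding").url);  // http://localhost:1235/v1
```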
Workflow Import
A ready-to-use workflow file has been generated at:
hf_space/logos_n8n_workflow.json
Usage:
- Open the n8n Editor.
- Click Workflow > Import from File.
- Select `logos_n8n_workflow.json`.
- Execute. It will scan your codebase using the Local Nano Swarm.
Connection Health Check
Verify the stack is active with this rhyme test:
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-nano",
    "messages": [
      {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
      {"role": "user", "content": "What day is it today?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```
- Click the Local Server icon (`<->`) on the left sidebar.
- Ensure settings:
  - Port: `1234`
  - CORS: On (Recommended)
- Click Start Server.
- Green Light: The log should say `Server listening on http://localhost:1234`.
High-Speed Inference (Math & Logic)
- Tool: DeepInfra / Groq
- Endpoint: `https://api.deepseek.com/v1`
- Model: `deepseek-v3` (math verification, topology)
Synthesis (Architecture)
- Tool: Google Vertex / Gemini
- Role: Systems Architect (High-level synthesis)
3. The N8N Topology: "Router & Jury"
Phase A: The Dispatcher (Llama-3-8B-Groq)
Classifies incoming request type:
- Systems Architecture -> Route to Gemini
- Python Implementation -> Route to Dolphin (Local)
- Mathematical Proof -> Route to DeepSeek (API)
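The dispatcher can be sketched as a plain classifier before any LLM is involved; the keyword lists below are illustrative placeholders for what the Groq-hosted Llama-3-8B would decide on fuzzier requests:

```javascript
// Dispatcher sketch: classify a request and pick a route.
// The keyword heuristics are placeholders; a small LLM handles
// requests these patterns miss.
function classify(request) {
  const text = request.toLowerCase();
  if (/\b(proof|theorem|topology|manifold)\b/.test(text)) return "math";
  if (/\b(python|script|adapter|implement)\b/.test(text)) return "coding";
  return "architecture";  // default: high-level synthesis via Gemini
}

console.log(classify("Write a Python adapter for the API"));  // coding
console.log(classify("Verify this manifold proof"));          // math
```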
Phase B: Parallel Execution
Use Merge Node (Wait Mode) to execute paths simultaneously.
- Path 1 (Math): DeepSeek analyzes Prime Potentiality/Manifold logic.
- Path 2 (Code): Dolphin writes adapters/scripts locally.
- Implementation Helper:

```python
# Use a pipeline as a high-level helper for local execution.
from transformers import pipeline

pipe = pipeline("text-generation", model="dphn/Dolphin-X1-8B-GGUF")
messages = [{"role": "user", "content": "Write the adapter."}]
pipe(messages)
```
- Path 3 (Sys): Gemini drafts Strategy/README.
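Outside n8n, the fan-out/merge pattern of Phase B is equivalent to `Promise.all`; the three `ask*` functions below are stubs standing in for the real HTTP Request nodes:

```javascript
// Fan-out / merge sketch: run all three paths concurrently and wait
// for every result, mirroring the Merge node in Wait mode.
// The ask* functions are stubs in place of real HTTP calls.
const askDeepSeek = async (q) => `math: analysis of "${q}"`;
const askDolphin  = async (q) => `code: adapter for "${q}"`;
const askGemini   = async (q) => `sys: strategy for "${q}"`;

async function fanOut(question) {
  const [math, code, sys] = await Promise.all([
    askDeepSeek(question),
    askDolphin(question),
    askGemini(question),
  ]);
  return { math, code, sys };  // merged payload for the consensus node
}

fanOut("prime potentiality").then((r) => console.log(r.code));
```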
Phase C: Consensus (The Annealing)
Final LLM Node synthesizes outputs:
"Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek constraints."
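Assembling that consensus prompt from the merged Phase B outputs is simple string work; the `consensusPrompt` helper and its field names are illustrative:

```javascript
// Assemble the consensus prompt for the final LLM node from the
// three merged path outputs. The priority rule is baked into the text.
function consensusPrompt({ math, code, sys }) {
  return [
    "Synthesize the following perspectives into one answer.",
    "If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek constraints.",
    `DeepSeek (math): ${math}`,
    `Dolphin (code): ${code}`,
    `Gemini (systems): ${sys}`,
  ].join("\n");
}

const prompt = consensusPrompt({
  math: "n must be prime",
  code: "def f(n): ...",
  sys: "three-tier design",
});
console.log(prompt.split("\n").length);  // 5
```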
4. Implementation Config
HTTP Request Node (Generic)
- Method: POST
- URL: `{{ $json.baseUrl }}/chat/completions`
- Headers: `Authorization: Bearer {{ $json.apiKey }}`
- Body:

```json
{
  "model": "{{ $json.modelName }}",
  "messages": [
    { "role": "system", "content": "You are a LOGOS systems engineer." },
    { "role": "user", "content": "{{ $json.prompt }}" }
  ],
  "temperature": 0.2
}
```
Model Selector (Code Node)
```javascript
// Route by task type. n8n runs natively (see Infrastructure above),
// so local endpoints use plain localhost instead of host.docker.internal.
if (items[0].json.taskType === "coding") {
  return { json: {
    baseUrl: "http://localhost:11434/v1",
    modelName: "dolphin-llama3",
    apiKey: "ollama"
  }};
} else if (items[0].json.taskType === "math") {
  return { json: {
    baseUrl: "https://api.deepseek.com/v1",
    modelName: "deepseek-coder",
    apiKey: "YOUR_DEEPSEEK_KEY"
  }};
}
// Fallback: unrecognized task types go to the local router model.
return { json: {
  baseUrl: "http://localhost:1234/v1",
  modelName: "nvidia/nemotron-3-nano",
  apiKey: "lm-studio"
}};
```
This architecture breaks the bottleneck by using Dolphin for grunt work (local/free) and specialized models for high-IQ tasks.