# N8N Mixture of Agents (MoA) Architecture
The "Google Antigravity" Neural Router
## 1. Core Philosophy
Treat n8n as a **Neural Router**, decoupling "Thinking" (Logic/Architecture) from "Inference" (Execution/Code). This bypasses latency bottlenecks and model refusals by routing each task to the most suitable model.
## 2. Infrastructure: The "OpenAI-Compatible" Bridge
**Optimization**: Run n8n **NATIVELY** on Windows (`npm install -g n8n`) instead of Docker.
- **Why**: Eliminates the `host.docker.internal` bridge bottleneck.
- **Effect**: N8N talks directly to `localhost:1234` with zero latency overhead.
Standardize all providers to the OpenAI API protocol.
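Because every provider speaks the same protocol, one request builder covers all of them. A minimal sketch; the provider table and API keys are placeholders, not real credentials:

```python
# Build an OpenAI-compatible chat request for any provider.
# The provider entries below are illustrative; swap in your own keys.
PROVIDERS = {
    "lmstudio": {"base_url": "http://localhost:1234/v1", "api_key": "lm-studio"},
    "ollama":   {"base_url": "http://localhost:11434/v1", "api_key": "ollama"},
    "deepseek": {"base_url": "https://api.deepseek.com/v1", "api_key": "YOUR_KEY"},
}

def build_chat_request(provider: str, model: str, prompt: str) -> dict:
    """Return the URL, headers, and JSON body for a chat completion call."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {cfg['api_key']}",
        },
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
    }

req = build_chat_request("ollama", "dolphin-llama3", "Write the adapter.")
print(req["url"])  # http://localhost:11434/v1/chat/completions
```

Only `base_url` and `api_key` change per provider; the body shape is identical everywhere, which is what makes the router trivial to extend.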
### Local (Code & Privacy)
- **Tool**: Ollama / LM Studio (The "New Friends" Cluster)
- **Endpoint**:
- Ollama: `http://localhost:11434/v1`
- LM Studio: `http://localhost:1234/v1`
### Local Stack (The "Nano Swarm")
Instead of one giant model, use a stack of specialized lightweight models to save RAM:
- **Router/Logic**: `nvidia/nemotron-3-nano` or `Phi-3-Mini` (High logic/param ratio).
- **Coding**: `deepseek-coder-6.7b` or `dolphin-2.9-llama3-8b`.
- **Creative**: `openhermes-2.5-mistral-7b`.
**Configuration**:
- **Endpoint**: `http://localhost:1234/v1`
- **Multi-Model**: If using LM Studio, load the specific model needed for the batch, or run multiple instances on ports `1234`, `1235`, `1236`.
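If each specialist runs in its own LM Studio instance, a simple role map keeps routing explicit. A sketch; the ports mirror the multi-instance scheme above, and the model IDs are examples from the stack list:

```python
# Map each Nano Swarm role to its own local instance and model.
# Ports follow the 1234/1235/1236 scheme above; model IDs are examples.
SWARM = {
    "router":   {"port": 1234, "model": "nvidia/nemotron-3-nano"},
    "coding":   {"port": 1235, "model": "deepseek-coder-6.7b"},
    "creative": {"port": 1236, "model": "openhermes-2.5-mistral-7b"},
}

def endpoint_for(role: str) -> str:
    """Return the OpenAI-compatible base URL for a swarm role."""
    return f"http://localhost:{SWARM[role]['port']}/v1"

print(endpoint_for("coding"))  # http://localhost:1235/v1
```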
### Workflow Import
A ready-to-use workflow file has been generated at:
`hf_space/logos_n8n_workflow.json`
**Usage**:
1. Open N8N Editor.
2. Click **Workflow** > **Import from File**.
3. Select `logos_n8n_workflow.json`.
4. Execute. It will scan your codebase using the Local Nano Swarm.
### Connection Health Check
Verify the stack is active with this rhyme test:
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-nano",
    "messages": [
      {"role": "system", "content": "Always answer in rhymes. Today is Thursday"},
      {"role": "user", "content": "What day is it today?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```
### LM Studio Server Setup
1. Open **LM Studio**.
2. Click the **Local Server** icon (`<->`) on the left sidebar.
3. Ensure settings:
- **Port**: `1234`
- **CORS**: On (Recommended)
4. Click **Start Server**.
5. *Green Light*: The log should say `Server listening on http://localhost:1234`.
### High-Speed Inference (Math & Logic)
- **Tool**: DeepSeek API (alternatives: DeepInfra / Groq)
- **Endpoint**: `https://api.deepseek.com/v1`
- **Model**: `deepseek-v3` (Math verification, topology)
### Synthesis (Architecture)
- **Tool**: Google Vertex / Gemini
- **Role**: Systems Architect (High-level synthesis)
## 3. The N8N Topology: "Router & Jury"
### Phase A: The Dispatcher (Llama-3-8B-Groq)
Classifies incoming request type:
- **Systems Architecture** -> Route to Gemini
- **Python Implementation** -> Route to Dolphin (Local)
- **Mathematical Proof** -> Route to DeepSeek (API)
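A minimal dispatcher can be sketched as first-pass keyword routing, handing only ambiguous requests to the Llama-3 classifier. The keyword lists are illustrative assumptions, not a fixed taxonomy:

```python
# Naive first-pass classifier: route on keywords, fall back to the LLM.
ROUTES = {
    "architecture": "gemini",    # Systems Architecture -> Gemini
    "python":       "dolphin",   # Python Implementation -> Dolphin (local)
    "proof":        "deepseek",  # Mathematical Proof -> DeepSeek (API)
    "theorem":      "deepseek",
}

def dispatch(request: str) -> str:
    """Return a routing target, or a sentinel meaning 'ask the LLM dispatcher'."""
    text = request.lower()
    for keyword, target in ROUTES.items():
        if keyword in text:
            return target
    return "llm-classify"  # ambiguous: defer to the Llama-3-8B dispatcher

print(dispatch("Write a Python adapter for the manifold"))  # dolphin
```

Cheap deterministic routing like this saves a dispatcher call on the obvious cases; only the `llm-classify` fallback costs an inference round-trip.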
### Phase B: Parallel Execution
Use **Merge Node (Wait Mode)** to execute paths simultaneously.
1. **Path 1 (Math)**: DeepSeek analyzes Prime Potentiality/Manifold logic.
2. **Path 2 (Code)**: Dolphin writes adapters/scripts locally.
- *Implementation Helper*:
```python
# Use a pipeline as a high-level helper for local execution.
# Note: transformers does not load GGUF repos directly; use the
# standard Hugging Face checkpoint for the Dolphin model instead.
from transformers import pipeline

pipe = pipeline("text-generation", model="cognitivecomputations/dolphin-2.9-llama3-8b")
messages = [{"role": "user", "content": "Write the adapter."}]
print(pipe(messages, max_new_tokens=256))
```
3. **Path 3 (Sys)**: Gemini drafts Strategy/README.
### Phase C: Consensus (The Annealing)
Final LLM Node synthesizes outputs:
> "Synthesize perspectives. If Dolphin's code conflicts with DeepSeek's math, prioritize DeepSeek constraints."
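The consensus step reduces to a prompt template fed to the final LLM node. A sketch of assembling it from the three branch outputs; the bracketed section labels and argument names are assumptions:

```python
# Assemble the annealing prompt from the three parallel branch outputs.
def consensus_prompt(math: str, code: str, sys: str) -> str:
    """Combine the Math, Code, and Systems outputs into one synthesis prompt."""
    return (
        "Synthesize the following perspectives into one answer.\n"
        "If the code conflicts with the math, prioritize the math constraints.\n\n"
        f"[DeepSeek / Math]\n{math}\n\n"
        f"[Dolphin / Code]\n{code}\n\n"
        f"[Gemini / Systems]\n{sys}\n"
    )

prompt = consensus_prompt("x must be prime", "def f(x): ...", "Use a router layer")
print(prompt.splitlines()[0])  # Synthesize the following perspectives into one answer.
```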
## 4. Implementation Config
### HTTP Request Node (Generic)
- **Method**: POST
- **URL**: `{{ $json.baseUrl }}/chat/completions`
- **Headers**: `Authorization: Bearer {{ $json.apiKey }}`
- **Body**:
```json
{
"model": "{{ $json.modelName }}",
"messages": [
{ "role": "system", "content": "You are a LOGOS systems engineer." },
{ "role": "user", "content": "{{ $json.prompt }}" }
],
"temperature": 0.2
}
```
### Model Selector (Code Node)
```javascript
// n8n Code node: pick a provider config based on the task type.
// Code nodes must return an ARRAY of items, not a bare object.
const task = items[0].json.taskType;

if (task === "coding") {
  return [{ json: {
    baseUrl: "http://localhost:11434/v1", // native n8n talks to Ollama directly
    modelName: "dolphin-llama3",
    apiKey: "ollama"
  }}];
} else if (task === "math") {
  return [{ json: {
    baseUrl: "https://api.deepseek.com/v1",
    modelName: "deepseek-coder",
    apiKey: "YOUR_DEEPSEEK_KEY"
  }}];
}

// Fallback: route everything else to the local router model.
return [{ json: {
  baseUrl: "http://localhost:1234/v1",
  modelName: "nvidia/nemotron-3-nano",
  apiKey: "lm-studio"
}}];
```
This architecture breaks the bottleneck by delegating grunt work to Dolphin (local, free) and reserving the specialized API models for high-IQ tasks.