# HearthNet: Architecture Reference
> **Local-first community AI mesh.** Each participant runs a node on their own hardware.
> Nodes discover each other automatically and share AI capabilities, files, and community
> posts, with no central server required.
---
## High-Level Concept
```
┌──────────────────────────────────────────────────────────────────────────┐
│                      Community Mesh (LAN / overlay)                      │
│                                                                          │
│  ┌─────────────┐     mDNS/UDP     ┌─────────────┐    mDNS/UDP            │
│  │  Node A     │◄────────────────►│  Node B     │◄──────────────         │
│  │  (anchor)   │                  │  (hearth)   │                        │
│  │             │    capability    │             │                        │
│  │  CapBus ◄───┼────bus.call─────►┼──► CapBus   │                        │
│  │  LLM svc    │                  │  RAG svc    │                        │
│  │  RAG svc    │                  │  OCR svc    │                        │
│  │  Gradio UI  │                  │  Gradio UI  │                        │
│  └─────────────┘                  └─────────────┘                        │
└──────────────────────────────────────────────────────────────────────────┘
```
HearthNet is structured around three ideas:
1. **Node**: a Python process on someone's hardware (Raspberry Pi, laptop, server).
2. **CapabilityBus**: a message bus where services register *capabilities* (e.g. `llm.chat@1.0`). Any code, local or remote, calls a capability by name.
3. **Services**: pure-Python objects that handle capability calls. A node installs whichever services its hardware supports.
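The three ideas above can be illustrated with a toy in-process bus. This is a minimal sketch, not HearthNet's actual `CapabilityBus`: the class `ToyBus`, its `register`/`call` methods, and the echo-style handler are hypothetical names chosen for illustration. Only the `name@major.minor` capability format comes from the text.

```python
from typing import Any, Callable, Dict, Tuple

Handler = Callable[[dict], Any]


def parse_capability(spec: str) -> Tuple[str, int, int]:
    """Split 'llm.chat@1.0' into ('llm.chat', 1, 0)."""
    name, _, version = spec.partition("@")
    major, _, minor = version.partition(".")
    return name, int(major), int(minor)


class ToyBus:
    """Hypothetical stand-in for a capability bus: maps capability keys to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, int, int], Handler] = {}

    def register(self, spec: str, handler: Handler) -> None:
        # A "service" here is just a callable that accepts a request body.
        self._handlers[parse_capability(spec)] = handler

    def call(self, spec: str, body: dict) -> Any:
        key = parse_capability(spec)
        if key not in self._handlers:
            raise LookupError(f"no handler for {spec}")
        return self._handlers[key](body)


bus = ToyBus()
bus.register("llm.chat@1.0", lambda body: {"reply": body["prompt"].upper()})
result = bus.call("llm.chat@1.0", {"prompt": "hello"})
```

The caller only knows the capability name; whether the handler is a local model or something else is hidden behind the bus.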
---
## Module Map
### Phase 1: Foundation
| Module | Location | What it does |
|--------|----------|-------------|
| **M01 Identity** | `hearthnet/identity/` | Ed25519 node keys, community manifests, invite tokens |
| **M02 Discovery** | `hearthnet/discovery/` | mDNS + UDP multicast peer discovery |
| **M03 Bus** | `hearthnet/bus/` | Capability router, health ring buffer, trust levels |
| **M04 LLM** | `hearthnet/services/llm/` | Local model backends (Ollama, llama.cpp, LM Studio, HF, Anthropic) |
| **M05 RAG** | `hearthnet/services/rag/` | Chunker → embedder → Chroma vector store + retrieval |
| **M06 Marketplace** | `hearthnet/services/marketplace/` | Event-sourced community board (posts, offers, requests) |
| **M07 Blobs** | `hearthnet/blobs/` | BLAKE3 content-addressed file store with chunked transfer |
| **M08 UI** | `hearthnet/ui/` | Gradio 8-tab interface + themes + topology component |
| **M09 Emergency** | `hearthnet/emergency/` | Async probe loop → emergency state machine |
| **M10 Chat** | `hearthnet/services/chat/` | Event-backed direct messages between nodes |
| **M11 Embedding** | `hearthnet/services/embedding/` | Sentence-transformer embeddings (BAAI/bge-small) |
| **M12 CLI** | `hearthnet/cli.py` | Click CLI: run, call, log, rag, invite, version, … |
| **M13 Onboarding** | `hearthnet/ui/onboarding.py` | Invite QR flow + first-run wizard |
### Phase 2: Resilience & Rich Services
| Module | Location | What it does |
|--------|----------|-------------|
| **M14 Federation** | `hearthnet/federation/` | Cross-community node manifests + signed bridges |
| **M15 Relay** | `hearthnet/relay/` | Public-IP relay tier for NAT traversal |
| **M16 Tokens** | `hearthnet/identity/tokens.py` | AuthToken / CapabilityToken scoped access |
| **M17 OCR** | `hearthnet/services/ocr/` | Tesseract / TrOCR text extraction |
| **M18 Translation** | `hearthnet/services/translation/` | NLLB-200 local translation |
| **M19 STT/TTS** | `hearthnet/services/stt_tts/` | Whisper STT + Coqui/pyttsx3 TTS |
| **M20 Vision** | `hearthnet/services/vision/` | Florence-2 image captioning / VQA |
| **M21 Tool Calls** | `hearthnet/services/tools/` | LLM tool-call executor (plant ID, search, …) |
| **M22 Mobile** | `hearthnet/ui/mobile/` | PWA manifest + service worker for home-screen install |
| **M23 E2E Encryption** | `hearthnet/crypto/` | X25519 ECDH + ChaCha20-Poly1305 channel encryption |
| **M24 Rerank** | `hearthnet/services/rerank/` | Cross-encoder reranking for RAG results |
| **M25 Group Chat** | `hearthnet/services/group_chat/` | Multi-party room-based chat |
### Phase 3: Experimental (opt-in via `config.toml`)
| Module | Location | Flag | What it does |
|--------|----------|------|-------------|
| **M26 Distributed Inference** | `hearthnet/distributed_inference/` | `research.distributed_inference` | Layer-shard a 7B model across LAN nodes (Petals-style) |
| **M27 MoE Routing** | `hearthnet/moe/` | `research.moe_routing` | Route queries to best expert (model/service/human) via learned scorer |
| **M28 FedLearn** | `hearthnet/fedlearn/` | `research.fedlearn` | FedAvg LoRA fine-tuning without sharing raw data |
| **M29 LoRa Beacons** | `hearthnet/lora/` | `research.lora_beacons` | 868 MHz offline "I'm alive" heartbeats via USB LoRa stick |
| **M30 Evidence Graph** | `hearthnet/evidence/` | `research.evidence` | Claim → attest → dispute provenance graph + EBKH bridge |
| **M31 Civil Defense** | `hearthnet/civdef/` | `research.civil_defense` | THW/DRK/KatS alert pipeline with role certs + audit chain |
| **M32 Protocol Standard** | `hearthnet/services/protocol/` | on by default | Protocol version list + conformance report |
### Cross-Cutting
| ID | Location | What it does |
|----|----------|-------------|
| **X01 Transport** | `hearthnet/transport/` | HTTP/SSE client, backpressure, rate limiting, frame types |
| **X02 Events** | `hearthnet/events/` | SQLite Lamport event log + gossip sync |
| **X03 Observability** | `hearthnet/observability/` | Tracing, metrics, Doctor health checks, TrackioExporter |
| **X04 Config** | `hearthnet/config.py` | Typed TOML config + ResearchConfig feature flags |
| **X05 DHT** | `hearthnet/dht/` | Kademlia-inspired DHT for cross-LAN peer lookup |
| **X06 WebSocket** | `hearthnet/transport/` | WebSocket pubsub (StateBus → live UI push) |
| **X07 Federated Metrics** | `hearthnet/observability/` | Opt-in aggregate mesh health metrics |
| **X08 Tensor Transport** | `hearthnet/transport/tensor/` | Chunked tensor stream for M26 distributed inference |
| **X09 Conformance Suite** | `hearthnet/conformance/` | 21-check black-box conformance runner |
---
## Composition Root
`HearthNode` in [hearthnet/node.py](hearthnet/node.py) is the single composition root.
```python
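import asyncio
from hearthnet.node import HearthNode

async def main() -> None:
    node = HearthNode(
        node_id="my-node",
        display_name="Alice's Pi",
        community_id="ed25519:abc123",
    )
    node.install_services(corpus="general")  # register supported services on the bus
    await node.start()

asyncio.run(main())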
```
`install_services()` registers every service the local hardware supports on the bus. Heavy optional dependencies (torch, chromadb, etc.) are imported lazily and fail gracefully: a node with no GPU still works, it just cannot answer GPU-only capabilities.
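The lazy-import behaviour described above might look roughly like the following. This is a sketch under assumptions, not the code in `install_services()`: the helper names `try_import` and `supported_services`, and the service-to-module mapping, are hypothetical.

```python
import importlib
from types import ModuleType
from typing import Dict, List, Optional


def try_import(module_name: str) -> Optional[ModuleType]:
    """Return the module if it is installed, otherwise None (no exception escapes)."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def supported_services(requirements: Dict[str, List[str]]) -> List[str]:
    """Keep only the services whose optional dependencies all import cleanly."""
    return [
        service
        for service, modules in requirements.items()
        if all(try_import(name) is not None for name in modules)
    ]


# Hypothetical mapping of service name -> optional modules it needs.
requirements = {
    "core": ["json"],  # standard library, always present
    "gpu_only": ["module_that_is_not_installed_xyz"],  # simulates a missing heavy dependency
}
available = supported_services(requirements)
```

A service with a missing dependency is simply left out of the list, so the node starts with a reduced set of capabilities instead of crashing.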
---
## Capability Bus
```
Caller ── bus.call(name, version, body) ──┐
                                          ▼
                                 ┌──────────────────┐
                                 │  CapabilityBus   │
                                 │                  │
                                 │  Registry        │
                                 │ ┌──────────────┐ │
                                 │ │ local route  ├─┼──► Service.handle()
                                 │ ├──────────────┤ │
                                 │ │ remote route ├─┼──► HTTP POST /bus/v1/call
                                 │ └──────────────┘ │
                                 │  HealthMonitor   │
                                 │  TrustFilter     │
                                 └──────────────────┘
```
- **Local route**: the service is installed on this node, so the bus makes a direct Python call.
- **Remote route**: the capability is advertised by a peer, so the bus sends an HTTP POST to that peer's transport.
- **Version negotiation**: capabilities are registered with a `(major, minor)` version; the bus picks the highest compatible version.
- **Health monitoring**: each service's response times are tracked in a ring buffer; unhealthy services are quarantined for `BUS_QUARANTINE_SECONDS`.
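Version negotiation and health monitoring can be sketched as follows. Several details here are assumptions, since the text does not define them: "compatible" is taken to mean same major version with an equal or higher minor version, "unhealthy" is taken to mean too many slow responses in the window, and the value of `BUS_QUARANTINE_SECONDS`, the `HealthRing` class, and its thresholds are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Version = Tuple[int, int]  # (major, minor)
BUS_QUARANTINE_SECONDS = 60.0  # assumed value for this sketch


def pick_version(requested: Version, available: List[Version]) -> Optional[Version]:
    """Highest registered version with the same major and minor >= requested."""
    major, minor = requested
    compatible = [v for v in available if v[0] == major and v[1] >= minor]
    return max(compatible) if compatible else None


class HealthRing:
    """Keeps the last `size` response times; quarantines after too many slow calls."""

    def __init__(self, size: int = 5, slow_after: float = 2.0, max_slow: int = 3) -> None:
        self._latencies: Deque[float] = deque(maxlen=size)  # ring buffer
        self._slow_after = slow_after
        self._max_slow = max_slow
        self._quarantined_until: Optional[float] = None

    def record(self, latency_s: float, now: float) -> None:
        self._latencies.append(latency_s)
        slow_calls = sum(1 for s in self._latencies if s > self._slow_after)
        if slow_calls >= self._max_slow:
            self._quarantined_until = now + BUS_QUARANTINE_SECONDS
            self._latencies.clear()  # start with a clean window after quarantine

    def is_healthy(self, now: float) -> bool:
        return self._quarantined_until is None or now >= self._quarantined_until


chosen = pick_version((1, 0), [(1, 0), (1, 2), (2, 0)])

ring = HealthRing()
for t in (0.0, 1.0, 2.0):
    ring.record(latency_s=5.0, now=t)  # three slow calls in a row
```

Passing `now` explicitly keeps the sketch deterministic; a real monitor would more likely read a monotonic clock.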
---
## Data Flow: LLM Chat Request
```
User types in Gradio UI
        │
        ▼
app.py  (Gradio event handler)
        │  bus.call("llm.chat@1.0", body)
        ▼
CapabilityBus.call()
        │
        ├─ local LlmService found?
        │       └─ yes → LlmService.handle() → backend.chat() → yield Token
        │
        └─ no local service
                │  peer has llm.chat?
                ├─ yes → HTTP POST /bus/v1/call → remote node → stream tokens back
                └─ no  → CapabilityError("not_found")
```
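The decision tree in the diagram can be expressed as a small routing function. This is a sketch under assumptions: `route_call` and the two dictionaries are hypothetical, the remote handler is a plain function standing in for the HTTP POST, and `ocr.extract@1.0` is an invented capability name. Only `CapabilityError("not_found")` and `llm.chat@1.0` come from the diagram.

```python
from typing import Callable, Dict, Iterator

StreamHandler = Callable[[dict], Iterator[str]]


class CapabilityError(Exception):
    """Raised when neither a local nor a remote route exists."""


def route_call(
    name: str,
    body: dict,
    local: Dict[str, StreamHandler],
    remote: Dict[str, StreamHandler],
) -> Iterator[str]:
    """Prefer a local service, fall back to a peer, otherwise fail with not_found."""
    handler = local.get(name) or remote.get(name)
    if handler is None:
        raise CapabilityError("not_found")
    return handler(body)


def local_llm(body: dict) -> Iterator[str]:
    # Stand-in for a local model backend that streams tokens.
    for word in body["prompt"].split():
        yield word


def remote_ocr(body: dict) -> Iterator[str]:
    # Stand-in for an HTTP POST to a peer that streams the result back.
    yield "remote:" + body["image"]


local = {"llm.chat@1.0": local_llm}
remote = {"ocr.extract@1.0": remote_ocr}  # hypothetical capability name

tokens = list(route_call("llm.chat@1.0", {"prompt": "hi there"}, local, remote))
remote_result = list(route_call("ocr.extract@1.0", {"image": "scan.png"}, local, remote))
```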
---
## Discovery Flow
```
Node boots
    │
    ├── mDNS: register _hearthnet._tcp.local.  (LAN multicast DNS)
    ├── UDP:  send announce to 224.0.0.251:7079 every 15s
    │
    ▼
PeerRegistry receives announcements from other nodes
    │
    ├── new peer → RegistryEvent(kind="added", entry=...)
    ├── peer gone (TTL expired) → RegistryEvent(kind="removed", ...)
    └── ManifestPublisher re-publishes every 300s
```
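The added/removed bookkeeping in the diagram can be sketched with a last-seen table. This is not the real `PeerRegistry`: the class `ToyPeerRegistry`, the TTL value, and the simplified `RegistryEvent` (which carries a `node_id` instead of a full `entry`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

PEER_TTL_SECONDS = 45.0  # assumed: three missed 15-second announcements


@dataclass(frozen=True)
class RegistryEvent:
    kind: str  # "added" or "removed"
    node_id: str


class ToyPeerRegistry:
    """Tracks when each peer was last heard from and expires silent peers."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, float] = {}

    def on_announce(self, node_id: str, now: float) -> List[RegistryEvent]:
        is_new = node_id not in self._last_seen
        self._last_seen[node_id] = now  # refresh the TTL on every announcement
        return [RegistryEvent("added", node_id)] if is_new else []

    def expire(self, now: float) -> List[RegistryEvent]:
        gone = [n for n, seen in self._last_seen.items() if now - seen > PEER_TTL_SECONDS]
        for node_id in gone:
            del self._last_seen[node_id]
        return [RegistryEvent("removed", node_id) for node_id in gone]


registry = ToyPeerRegistry()
first = registry.on_announce("node-b", now=0.0)
repeat = registry.on_announce("node-b", now=15.0)
still_alive = registry.expire(now=50.0)   # 35s since last announce
expired = registry.expire(now=61.0)       # 46s since last announce
```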
---
## Emergency Mode
```
EmergencyDetector  (async loop, 30s probe)
    │
    ├── probe connectivity endpoints
    │
    ├── ONLINE  → EmergencyState.NORMAL
    │               → UI shows normal theme
    │
    └── OFFLINE → EmergencyState.EMERGENCY
                    → UI switches to emergency theme (red)
                    → emergency.llm.chat capability activated
                    → LoRa beacons sent if hardware available (M29)
                    → Civil defense alerts published if role cert present (M31)
```
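The state machine in the diagram can be sketched synchronously, leaving out the 30-second async loop. The rule used here (ONLINE if any probe succeeds) is an assumption, as are the `ToyDetector` class and its `tick` method; only the `EmergencyState` names come from the diagram.

```python
from enum import Enum
from typing import Callable, Iterable, List


class EmergencyState(Enum):
    NORMAL = "normal"
    EMERGENCY = "emergency"


def evaluate(probe_results: Iterable[bool]) -> EmergencyState:
    """Assumed rule: the node is online if at least one probe succeeded."""
    return EmergencyState.NORMAL if any(probe_results) else EmergencyState.EMERGENCY


class ToyDetector:
    """Calls `on_change` only when the state actually flips."""

    def __init__(self, on_change: Callable[[EmergencyState], None]) -> None:
        self.state = EmergencyState.NORMAL
        self._on_change = on_change

    def tick(self, probe_results: Iterable[bool]) -> None:
        new_state = evaluate(probe_results)
        if new_state is not self.state:
            self.state = new_state
            self._on_change(new_state)  # e.g. switch theme, activate capabilities


transitions: List[EmergencyState] = []
detector = ToyDetector(on_change=transitions.append)
detector.tick([True, False])    # one endpoint reachable: stays NORMAL
detector.tick([False, False])   # all probes fail: enters EMERGENCY
detector.tick([False, False])   # still offline: no duplicate notification
detector.tick([True])           # connectivity restored: back to NORMAL
```

In an async setting, `tick` would be called from the probe loop after each round of connectivity checks.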
---
## MoE Expert Routing (M27)
```
Query arrives at any node
    │
    ▼
MoeRouter.route(query, top_k=3)
    │
    ├── score all registered ExpertDescriptors against query
    │       (tag overlap + cosine similarity + recency weighting)
    │
    └── return ranked RouteResult
            │
            ├── expert_type="model"    → bus.call("llm.chat@1.0", ...) on that node
            ├── expert_type="service"  → bus.call(expert_capability, ...)
            ├── expert_type="human"    → notify via chat + start handoff timer (M27 §4)
            └── expert_type="external" → HTTP call to opt-in external API
```
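To enable it, set the `moe_routing` research flag to `true` in `~/.config/hearthnet/config.toml`; the sample configuration in the Configuration section also lists a master `enable` switch for experimental modules.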
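The scoring step in the diagram combines tag overlap, cosine similarity, and recency. The following is a sketch with fixed, equal weights, which is an assumption: the module table describes a learned scorer, and the field names on `ExpertDescriptor`, the `half_life` parameter, and the example experts are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class ExpertDescriptor:
    name: str
    expert_type: str          # "model", "service", "human" or "external"
    tags: Set[str]
    embedding: Sequence[float]
    age_seconds: float        # time since the expert was last seen


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(expert: ExpertDescriptor, query_tags: Set[str],
          query_embedding: Sequence[float], half_life: float = 600.0) -> float:
    tag_overlap = len(expert.tags & query_tags) / len(query_tags) if query_tags else 0.0
    similarity = cosine(expert.embedding, query_embedding)
    recency = 0.5 ** (expert.age_seconds / half_life)  # halves every `half_life` seconds
    return (tag_overlap + similarity + recency) / 3.0  # equal weights (assumed)


def route(experts: List[ExpertDescriptor], query_tags: Set[str],
          query_embedding: Sequence[float], top_k: int = 3) -> List[ExpertDescriptor]:
    ranked = sorted(experts, key=lambda e: score(e, query_tags, query_embedding), reverse=True)
    return ranked[:top_k]


experts = [
    ExpertDescriptor("translator", "service", {"language"}, [0.0, 1.0], age_seconds=600.0),
    ExpertDescriptor("botanist", "human", {"plants"}, [1.0, 0.0], age_seconds=0.0),
]
ranked = route(experts, query_tags={"plants"}, query_embedding=[1.0, 0.0], top_k=1)
```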
---
## Distributed Inference (M26: BitTorrent-style LLM sharing)
```
Node A: layers 0–15 of Llama-3.2-3B
Node B: layers 16–27 of Llama-3.2-3B
Node C: layers 28–35 (lm_head) of Llama-3.2-3B
    │
    ▼
PipelineOrchestrator.plan(model_id="llama3.2:3b")
    │  → discovers shards via experimental.distributed_llm.shard.list
    │  → checks layer coverage: 0..35 ✓
    │
PipelineOrchestrator.run(pipeline, input_tokens)
    │  → sends activations A→B via X08 TensorTransport (1 MiB chunks)
    │  → B sends activations B→C
    │  → C returns final logits
    │
    └── caller gets streamed tokens like any local model
```
Model weights are shared chunk by chunk using BLAKE3 CID-addressed blob transfer. This is the same
mechanism as file blobs (M07), but optimised for `.gguf` / `.safetensors` files.
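The layer-coverage check and the 1 MiB chunking mentioned above can be sketched as two small functions. Both `plan_pipeline` and `chunk_sizes` are hypothetical helpers, not part of `PipelineOrchestrator`; the shard ranges mirror the diagram and are treated as inclusive.

```python
from typing import Dict, List, Optional, Tuple

MIB = 1024 * 1024


def plan_pipeline(shards: Dict[str, Tuple[int, int]], num_layers: int) -> Optional[List[str]]:
    """Order nodes so their inclusive layer ranges cover 0..num_layers-1 with no gap or overlap."""
    ordered = sorted(shards.items(), key=lambda item: item[1][0])
    next_layer = 0
    pipeline: List[str] = []
    for node, (first, last) in ordered:
        if first != next_layer or last < first:
            return None  # gap, overlap, or malformed range
        pipeline.append(node)
        next_layer = last + 1
    return pipeline if next_layer == num_layers else None


def chunk_sizes(total_bytes: int, chunk_bytes: int = MIB) -> List[int]:
    """Sizes of the chunks needed to send `total_bytes`, the last one possibly shorter."""
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


shards = {"B": (16, 27), "A": (0, 15), "C": (28, 35)}
pipeline = plan_pipeline(shards, num_layers=36)
incomplete = plan_pipeline({"A": (0, 15), "C": (28, 35)}, num_layers=36)
sizes = chunk_sizes(int(2.5 * MIB))
```

Returning `None` for an incomplete plan lets the caller fall back to a single-node model instead of starting a pipeline that cannot finish.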
---
## File Tree
```
hearthnet/
├── node.py                 # HearthNode: composition root
├── types.py                # Shared type aliases (NodeID, ShardID, AlertID, …)
├── constants.py            # All numeric defaults and limits
├── config.py               # HearthnetConfig + ResearchConfig (TOML-backed)
├── cli.py                  # Click CLI entry point
├── facades.py              # HearthFacade: thin high-level API for app.py
├── controller.py           # HearthController: legacy thin wrapper
│
├── bus/                    # M03 CapabilityBus
│   ├── router.py           # routing logic (local → remote)
│   ├── registry.py         # CapabilityEntry, RegistryEvent, Diff
│   ├── capability.py       # CapabilityEntry dataclass
│   └── health.py           # ring-buffer health monitor
│
├── identity/               # M01
│   ├── keys.py             # Ed25519 key generation + signing
│   ├── manifest.py         # NodeManifest, CommunityManifest, CommunityPolicy, …
│   └── tokens.py           # AuthToken, CapabilityToken
│
├── discovery/              # M02
│   └── peers.py            # mDNS + UDP multicast PeerRegistry
│
├── transport/              # X01 / X06 / X08
│   ├── client.py           # HTTP + SSE client
│   ├── streams.py          # Frame, SseReader
│   ├── backpressure.py     # FlowControl, RateCheck, RateLimiter
│   └── tensor/             # X08 tensor chunked transport
│
├── events/                 # X02
│   ├── log.py              # SQLite Lamport event log
│   └── sync.py             # Gossip SyncClient / SyncServer
│
├── observability/          # X03
│   ├── tracing.py          # attach/detach trace context
│   ├── metrics.py          # MetricsCollector, TrackioExporter
│   └── doctor.py           # DoctorResult, CheckResult, DoctorService
│
├── services/               # M04–M21 + M32
│   ├── llm/                # M04: backends ollama, llama_cpp, lmstudio, hf_api, anthropic
│   ├── rag/                # M05
│   ├── marketplace/        # M06
│   ├── chat/               # M10
│   ├── embedding/          # M11
│   ├── ocr/                # M17
│   ├── translation/        # M18
│   ├── stt_tts/            # M19
│   ├── vision/             # M20
│   ├── tools/              # M21
│   ├── group_chat/         # M25
│   └── protocol/           # M32
│
├── ui/                     # M08
│   ├── app.py              # Gradio 8-tab entry point
│   ├── tabs/               # one file per tab
│   ├── theme.py            # hearthnet_theme, emergency_theme
│   ├── topology.py         # TopologyComponent (mesh graph)
│   ├── onboarding.py       # first-run wizard + invite QR
│   └── mobile/             # M22 PWA manifest + service worker
│
├── emergency/              # M09
│   ├── detector.py         # async probe loop
│   └── state.py            # EmergencyState enum
│
├── crypto/                 # M23
│   └── channel.py          # X25519 + ChaCha20-Poly1305
│
├── blobs/                  # M07
│   └── store.py            # BLAKE3 CID store + chunked reader
│
├── dht/                    # X05
├── federation/             # M14
├── relay/                  # M15
│
├── distributed_inference/  # M26 (experimental)
├── moe/                    # M27 (experimental)
├── fedlearn/               # M28 (experimental)
├── lora/                   # M29 (experimental)
├── evidence/               # M30 (experimental)
├── civdef/                 # M31 (experimental)
└── conformance/            # X09
```
---
## Configuration
`~/.config/hearthnet/config.toml` (created on first run with defaults):
```toml
[node]
node_id = "" # auto-generated Ed25519 key ID
display_name = "My Node"
data_dir = "~/.hearthnet"
[transport]
http_port = 7080
ui_port = 7860
[llm]
default_backend = "ollama" # "ollama" | "llama_cpp" | "lmstudio" | "hf_api" | "smollm"
[rag]
corpus_dir = "~/.hearthnet/corpus"
embedding_model = "BAAI/bge-small-en-v1.5"
[policy.research]
enable = false # master switch for all experimental modules
moe_routing = false # M27
distributed_inference = false # M26
fedlearn = false # M28
lora_beacons = false # M29
evidence = false # M30
civil_defense = false # M31
```
---
## Connecting a Local Node to the HF Space
The HF Space at `https://huggingface.co/spaces/build-small-hackathon/HearthNet` is a
single-node anchor you can peer with from any local machine.
```bash
# 1. Clone and install
git clone https://huggingface.co/spaces/build-small-hackathon/HearthNet
cd HearthNet
pip install -e .
# 2. Run your local node (pick a free port if 7080 is taken)
python -m hearthnet.cli run --http-port 7080 --ui-port 7860
# 3. Manually add the HF Space anchor as a peer (different network = manual)
python -m hearthnet.cli call discovery.peer.add 1 0 \
'{"endpoint":"https://build-small-hackathon-hearthnet.hf.space","node_id":"hf-space-anchor"}'
# 4. Verify peering
python -m hearthnet.cli call discovery.peers 1 0 '{}'
```
Or use the helper script:
```bash
python scripts/connect_to_hf.py
```
Once peered, your local node can:
- Route LLM queries **from** the HF Space to your local (better) model
- Push community posts that appear in the HF Space UI
- Share blob files across the connection
> **Note:** The HF Space runs on a public server without a static IP for inbound connections.
> Your local node initiates the connection; the HF Space cannot discover you via mDNS.
> Use `discovery.peer.add` or the invite flow to establish the bridge manually.
---
## Security Model
- **Node identity**: an Ed25519 key pair is generated locally and never leaves the device.
- **Trust levels**: `unknown` → `member` → `trusted` → `anchor`. Capabilities can require a minimum trust level.
- **Capability scoping**: an `AuthToken` restricts which capabilities a caller may invoke.
- **Channel encryption**: M23 provides X25519 ECDH + ChaCha20-Poly1305 for inter-node transport (opt-in, off by default).
- **Experimental capabilities**: Phase 3 modules are off by default and require explicit opt-in. The bus refuses to register them unless the feature flag is on.
- **No central authority**: there is no HearthNet.com, no certificate authority, no registration server. Trust is established peer-to-peer via invite chains.
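The trust-level ordering and capability scoping described above can be sketched as an ordered enum plus a scope check. The `TrustLevel` enum values, the `MIN_TRUST` table, the `authorize` function, and every capability name other than `llm.chat@1.0` are hypothetical; the real `AuthToken` is a richer object than the plain set used here.

```python
from enum import IntEnum
from typing import Dict, Optional, Set


class TrustLevel(IntEnum):
    """Ordered so that comparisons follow unknown < member < trusted < anchor."""
    UNKNOWN = 0
    MEMBER = 1
    TRUSTED = 2
    ANCHOR = 3


# Hypothetical per-capability minimum trust levels.
MIN_TRUST: Dict[str, TrustLevel] = {
    "llm.chat@1.0": TrustLevel.MEMBER,
    "marketplace.moderate@1.0": TrustLevel.ANCHOR,
}


def authorize(capability: str, caller: TrustLevel,
              token_scope: Optional[Set[str]] = None) -> bool:
    """Allow the call only if trust is sufficient and the token scope (if any) covers it."""
    required = MIN_TRUST.get(capability, TrustLevel.UNKNOWN)
    if caller < required:
        return False
    if token_scope is not None and capability not in token_scope:
        return False
    return True


stranger_ok = authorize("llm.chat@1.0", TrustLevel.UNKNOWN)
member_ok = authorize("llm.chat@1.0", TrustLevel.MEMBER)
scoped_out = authorize("llm.chat@1.0", TrustLevel.TRUSTED, token_scope={"rag.query@1.0"})
moderator_ok = authorize("marketplace.moderate@1.0", TrustLevel.TRUSTED)
```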
---
## Testing
```bash
# Full suite (133 unit + integration tests):
pytest tests/ -q
# Skip slow E2E browser tests:
pytest tests/ -q -k "not e2e"
# Phase 3 experimental module tests only:
pytest tests/test_phase3_experimental.py -v
# Conformance runner (X09):
python -m hearthnet.conformance.runner --output conformance-report/
```
---
*This document is generated from the spec set in `docs/`. For per-module detail see:*
- *Phase 1+2: `00-OVERVIEW.md`, `CAPABILITY_CONTRACT.md`, `modules/M01-*.md` …*
- *Phase 3: `docs/p2_p3/IMPLEMENTATION_REFERENCE_p3.md`, `docs/p2_p3/M26-*.md` …*