devsomosahub committed on
Commit 6929073 · verified · 1 Parent(s): 04a4a4c

Upload docs/TECHNICAL.md with huggingface_hub

  1. docs/TECHNICAL.md +247 -0
docs/TECHNICAL.md ADDED
@@ -0,0 +1,247 @@
# Agent OS - Technical Documentation

## Overview

Agent OS is an operating system for AI agents: a desktop-like (macOS-style) interface that orchestrates multiple specialized AI models through a central manager model. Each agent is a pluggable "slot" that can be swapped, added, or removed easily.

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                    AGENT OS (React Frontend)                     │
│  ┌─────────┬──────────┬──────────┬─────────┬─────────┬────────┐  │
│  │ Browser │ Terminal │  Inbox   │ Mission │ Agents  │ Finder │  │
│  │         │          │          │ Control │         │        │  │
│  └─────────┴──────────┴──────────┴─────────┴─────────┴────────┘  │
├──────────────────────────────────────────────────────────────────┤
│                    SERVER (Node.js + Express)                    │
│  ┌───────────┬───────────┬───────────┬───────────┬────────────┐  │
│  │ Browser   │ PM2       │ Supabase  │ GitHub    │ SmolAgent  │  │
│  │ Manager   │ API       │ API       │ CLI       │ Daemon     │  │
│  └───────────┴───────────┴───────────┴───────────┴────────────┘  │
├──────────────────────────────────────────────────────────────────┤
│                       ORCHESTRATION LAYER                        │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                MANAGER MODEL (Orchestrator)                │  │
│  │       Receives task → Classifies → Routes → Returns        │  │
│  │     Llama 3.3 70B via HF Inference API (free with Pro)     │  │
│  └────────┬──────────────────┬─────────────────────┬──────────┘  │
│           │                  │                     │             │
│  ┌────────┴───────┐  ┌───────┴────────┐  ┌─────────┴──────────┐  │
│  │ Agent Slot 1   │  │ Agent Slot 2   │  │ Agent Slot N       │  │
│  │ Coding         │  │ SQL/Data       │  │ (pluggable)        │  │
│  │ Opus 4.6 API   │  │ 1.5B local     │  │ any model          │  │
│  └────────────────┘  └────────────────┘  └────────────────────┘  │
├──────────────────────────────────────────────────────────────────┤
│                          CENTRAL MEMORY                          │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                 Basic Memory (MCP Server)                  │  │
│  │        Markdown files + SQLite + Vector Embeddings         │  │
│  │           Persistent across sessions and agents            │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```

## Current Stack

### Frontend (React + TypeScript + Vite)
- **Desktop**: macOS-style interface with draggable windows, dock, and menu bar
- **Window Manager**: Zustand store (`useAppStore`) with `persist`
- **Apps**: Pluggable registry (`appRegistry.ts`)
- **Components**: Desktop, Dock, MenuBar, WindowFrame

### Backend (Node.js + Express)
- **Port**: 3000
- **Implemented APIs**:
  - `/api/browsers/*` - Browser automation via Playwright (create, navigate, click, type, screenshot)
  - `/api/pm2/*` - Process manager (list, restart, stop, logs)
  - `/api/supabase/*` - Supabase CLI proxy (auth, projects, tables, SQL)
  - `/api/github/*` - GitHub CLI proxy (auth, repos, issues, PRs, notifications)
  - `/api/smol/chat` - SmolAgent daemon proxy (port 8082)
  - `/api/launcher/*` - Proxy to claude-launcher-web (port 3002)
  - `/ws` - WebSocket for terminal, browser streaming, and file watching

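Two of the routes above forward to other local processes. As a rough illustration only, that lookup could be sketched as follows; `UPSTREAMS` and `resolveUpstream` are hypothetical names, and the actual `server.js` may wire its proxies differently.

```typescript
// Hypothetical sketch: which local port a proxied API prefix forwards to.
// Prefixes and ports come from the list above; the names are illustrative.
const UPSTREAMS: Record<string, number> = {
  "/api/smol": 8082,     // SmolAgent daemon
  "/api/launcher": 3002, // claude-launcher-web
};

// Returns the upstream port for a request path, or null when the route
// is handled by the Express server itself (port 3000).
function resolveUpstream(path: string): number | null {
  for (const [prefix, port] of Object.entries(UPSTREAMS)) {
    // Match on a path-segment boundary so "/api/smolder" is not proxied.
    if (path === prefix || path.startsWith(prefix + "/")) {
      return port;
    }
  }
  return null;
}
```

Keeping the prefix-to-port mapping in one table means a new proxied daemon only needs one extra entry.
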
### Infra (Vultr - 207.246.65.100)
- **OS**: Ubuntu, 4 CPUs, 8 GB RAM, no GPU
- **Active processes**:
  - `server.js` (port 3000) - Frontend + API
  - `smol-daemon.py` (port 8082) - SmolAgent backend
  - `llama-server` (port 8080) - Local 1.5B model on CPU
  - `launcher/server.js` (port 3002) - Claude launcher
  - `agent-bot` (port 9090) - Auxiliary bot

## App Registry

The app system is pluggable. Each app is registered in `appRegistry.ts`:

| App | ID | Status | Description |
|-----|----|--------|-------------|
| Browser | `browser` | Functional | Web browser backed by Playwright |
| Terminal | `terminal` | Functional | Terminal session via the launcher |
| Inbox | `inbox` | Placeholder | Task inbox with comments |
| Mission Control | `mission-control` | Placeholder | Kanban board for tasks |
| Agents | `agents` | Placeholder | Org chart of agents |
| Finder | `finder` | Placeholder | File browser for workspaces |
| Settings | `settings` | Placeholder | System settings |

### App Interface
```typescript
import type React from "react";

interface AppRegistryEntry {
  id: string;                 // Unique ID
  name: string;               // Display name
  icon: string;               // Dock icon
  description: string;        // Description
  component: React.LazyExoticComponent<React.ComponentType>; // Lazy-loaded React component
  defaultSize: { width: number; height: number };
  minSize?: { width: number; height: number };
  allowMultiple?: boolean;    // Allow multiple instances
  dockPinned?: boolean;       // Pinned to the dock
}
```

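To make the registry idea concrete, here is a minimal sketch of how entries might be indexed and queried. It is not the contents of `appRegistry.ts`: `AppEntrySketch`, `buildRegistry`, and `canOpen` are hypothetical names, the window sizes are made up, and the `component` field is left out so the sketch runs without React.

```typescript
// Simplified entry for illustration: the real AppRegistryEntry also carries
// a lazy React component, omitted here so the sketch runs without React.
interface AppEntrySketch {
  id: string;
  name: string;
  defaultSize: { width: number; height: number };
  allowMultiple?: boolean;
  dockPinned?: boolean;
}

// Hypothetical registry contents; IDs follow the table above, sizes are made up.
const apps: AppEntrySketch[] = [
  { id: "browser", name: "Browser", defaultSize: { width: 1024, height: 700 }, allowMultiple: true, dockPinned: true },
  { id: "terminal", name: "Terminal", defaultSize: { width: 800, height: 500 }, dockPinned: true },
  { id: "settings", name: "Settings", defaultSize: { width: 600, height: 400 } },
];

// Index entries by ID, rejecting duplicates so a plugged-in app cannot shadow another.
function buildRegistry(entries: AppEntrySketch[]): Map<string, AppEntrySketch> {
  const registry = new Map<string, AppEntrySketch>();
  for (const entry of entries) {
    if (registry.has(entry.id)) {
      throw new Error(`Duplicate app id: ${entry.id}`);
    }
    registry.set(entry.id, entry);
  }
  return registry;
}

// An app may open a new window if it allows multiple instances or has none open yet.
function canOpen(entry: AppEntrySketch, openCount: number): boolean {
  return entry.allowMultiple === true || openCount === 0;
}

const registry = buildRegistry(apps);
const dockApps = apps.filter((a) => a.dockPinned).map((a) => a.id);
```

Under these assumptions the dock would list `browser` and `terminal`, and a second Terminal window would be refused because that entry does not set `allowMultiple`.
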
## Agent Registry (TO BE IMPLEMENTED)

A registry of specialized agents. Each agent is a pluggable slot:

```typescript
interface AgentSlot {
  id: string;              // "coder", "sql", "text", "frontend"
  name: string;            // "Coding Agent"
  description: string;     // "Specialized in..."
  provider: "anthropic" | "openai" | "openrouter" | "huggingface" | "local";
  config: {
    model: string;         // "claude-opus-4-6" or "agent-os-1b5"
    endpoint?: string;     // Endpoint URL (local or API)
    apiKey?: string;       // API key
    temperature?: number;
    maxTokens?: number;
  };
  capabilities: string[];  // ["code", "sql", "text", "reasoning"]
  active: boolean;         // Enabled/disabled
}
```

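Since the registry is not implemented yet, the following is only a sketch of what a slot configuration and a capability lookup might look like. The model names come from this document; the endpoint URL, descriptions, capability tags, and the `findAgents` helper are assumptions for illustration.

```typescript
type Provider = "anthropic" | "openai" | "openrouter" | "huggingface" | "local";

interface AgentSlot {
  id: string;
  name: string;
  description: string;
  provider: Provider;
  config: {
    model: string;
    endpoint?: string;
    apiKey?: string;
    temperature?: number;
    maxTokens?: number;
  };
  capabilities: string[];
  active: boolean;
}

// Example slots. Model names come from this document; the endpoint URL,
// descriptions, and capability tags are assumptions for illustration.
const slots: AgentSlot[] = [
  {
    id: "coder",
    name: "Coding Agent",
    description: "Writes and refactors code",
    provider: "anthropic",
    config: { model: "claude-opus-4-6", maxTokens: 4096 },
    capabilities: ["code", "reasoning"],
    active: true,
  },
  {
    id: "sql",
    name: "SQL/Data Agent",
    description: "Turns natural language into SQL",
    provider: "local",
    config: { model: "agent-os-1b5", endpoint: "http://localhost:8080", temperature: 0 },
    capabilities: ["sql"],
    active: false,
  },
];

// Return the active slots that advertise a given capability.
function findAgents(all: AgentSlot[], capability: string): AgentSlot[] {
  return all.filter((s) => s.active && s.capabilities.includes(capability));
}
```

Here `findAgents(slots, "sql")` returns nothing because the SQL slot is marked inactive, which is how a disabled slot would drop out of routing.
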
### Planned Agents

| Slot | Model | Provider | Role |
|------|-------|----------|------|
| **Manager/Orchestrator** | Llama 3.3 70B | HF Inference API (free with Pro) | Routes tasks, classifies intent, coordinates agents |
| **Coding** | Claude Opus 4.6 | Anthropic API | Writes and refactors code |
| **SQL/Data** | agent-os-1b5 (custom) | Local llama-server | SQL queries, Supabase, information_schema |
| **Frontend** | (to be defined) | (to be defined) | UI/UX, React components |
| **Text** | (to be defined) | (to be defined) | Content creation, copywriting |
| **Research** | (to be defined) | (to be defined) | Web search, data analysis |

### Orchestration Flow

```
1. User types a message in the chat
2. Manager (Llama 70B) analyzes the intent:
   - "write a function that..." → routing: coder
   - "how many active VMs..." → routing: sql
   - "write a text about..." → routing: text
3. Manager forwards the task to the specialized agent
4. Agent processes it and returns the result
5. Manager formats the result and delivers it to the user
6. Central Memory records the interaction
```

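The flow above can be sketched in code. This is an assumption-heavy illustration: in the planned design an LLM classifies the intent, while the keyword table below only imitates that step, and `classify`, `agents`, `memoryLog`, and `handleMessage` are hypothetical names.

```typescript
type Route = "coder" | "sql" | "text";

// Stand-in for the manager model: in the planned design an LLM classifies
// intent; this keyword table only imitates that step so the flow can run.
const KEYWORDS: Array<[Route, RegExp]> = [
  ["sql", /\b(how many|count|table|query|select)\b/i],
  ["coder", /\b(function|refactor|bug|code|class)\b/i],
];

function classify(message: string): Route {
  for (const [route, pattern] of KEYWORDS) {
    if (pattern.test(message)) return route;
  }
  return "text"; // fallback slot for anything unmatched
}

// Hypothetical agent handlers; real slots would call a model API.
const agents: Record<Route, (task: string) => Promise<string>> = {
  coder: async (task) => `[coder] ${task}`,
  sql: async (task) => `[sql] ${task}`,
  text: async (task) => `[text] ${task}`,
};

// In-memory stand-in for the Central Memory record of each interaction.
const memoryLog: Array<{ message: string; route: Route; result: string }> = [];

// Steps 2-6 of the flow: classify, dispatch, collect, record.
async function handleMessage(message: string): Promise<string> {
  const route = classify(message);
  const result = await agents[route](message);
  memoryLog.push({ message, route, result });
  return result;
}
```

Swapping `classify` for a call to the manager model would leave the rest of the pipeline unchanged, which is the point of keeping classification and dispatch separate.
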
### Swapping Agents
At any time, the user can:
- Change the model behind a slot (e.g., switch the coder from Opus to GPT-4)
- Add a new slot
- Disable a slot
- Manually choose which agent to use

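These four operations could be modeled as small pure functions over the slot list. The sketch below is one possible shape, not the planned implementation: `Slot` is a trimmed-down stand-in for `AgentSlot`, and every function name is hypothetical.

```typescript
// Trimmed-down stand-in for AgentSlot, enough to show the operations.
interface Slot {
  id: string;
  provider: string;
  model: string;
  active: boolean;
}

// Replace the model behind a slot without touching the other slots.
function swapModel(slots: Slot[], id: string, provider: string, model: string): Slot[] {
  return slots.map((s) => (s.id === id ? { ...s, provider, model } : s));
}

// Add a slot, refusing duplicate IDs.
function addSlot(slots: Slot[], slot: Slot): Slot[] {
  if (slots.some((s) => s.id === slot.id)) {
    throw new Error(`Slot already exists: ${slot.id}`);
  }
  return [...slots, slot];
}

// Enable or disable a slot.
function setActive(slots: Slot[], id: string, active: boolean): Slot[] {
  return slots.map((s) => (s.id === id ? { ...s, active } : s));
}

// A manual choice overrides the orchestrator's routing; inactive slots are never picked.
function pickSlot(slots: Slot[], routed: string, manual?: string): Slot | undefined {
  const wanted = manual ?? routed;
  return slots.find((s) => s.id === wanted && s.active);
}

const initial: Slot[] = [
  { id: "coder", provider: "anthropic", model: "claude-opus-4-6", active: true },
];
const swapped = swapModel(initial, "coder", "openai", "gpt-4");
```

Returning new arrays instead of mutating in place fits a Zustand-style store, where state updates are expected to produce new objects.
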
## Central Memory: Basic Memory

### What it is
A persistent memory system built on Markdown + SQLite + Vector Embeddings. It runs as an MCP Server.

### Architecture
```
┌──────────────────────────────────────┐
│  Markdown Files                      │
│  - YAML frontmatter (metadata)       │
│  - [category] observations           │
│  - [[wiki-links]] relations          │
├──────────────────────────────────────┤
│  SQLite Index                        │
│  - Full-text search                  │
│  - Vector embeddings (FastEmbed)     │
│  - Hybrid search                     │
├──────────────────────────────────────┤
│  MCP Server                          │
│  - memory:// URLs                    │
│  - Note CRUD                         │
│  - Semantic navigation               │
│  - Context building                  │
└──────────────────────────────────────┘
```

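As an illustration of the Markdown layer, the sketch below extracts `[category]` observations and `[[wiki-links]]` from a note. It is not Basic Memory's own parser: the assumed line shapes (`- [category] text` and `[[Target]]`), the sample note, and the `parseNote` function are all hypothetical.

```typescript
interface ParsedNote {
  observations: Array<{ category: string; text: string }>;
  links: string[];
}

// Assumed line shapes: "- [category] some observation" and "[[Target Note]]".
function parseNote(markdown: string): ParsedNote {
  const observations: ParsedNote["observations"] = [];
  const links: string[] = [];
  for (const line of markdown.split("\n")) {
    // Observation: a list item whose text starts with a single-bracket category.
    const obs = /^\s*-\s+\[([^\[\]]+)\]\s+(.+)$/.exec(line);
    if (obs) {
      observations.push({ category: obs[1], text: obs[2].trim() });
    }
    // Relations: every [[wiki-link]] found on the line.
    const linkPattern = /\[\[([^\[\]]+)\]\]/g;
    let match: RegExpExecArray | null;
    while ((match = linkPattern.exec(line)) !== null) {
      links.push(match[1]);
    }
  }
  return { observations, links };
}

// Made-up note for the example.
const note = [
  "---",
  "title: Agent OS",
  "---",
  "- [decision] Use llama-server for the SQL slot",
  "- depends_on [[Basic Memory]]",
].join("\n");

const parsed = parseNote(note);
```

The frontmatter lines match neither pattern and are skipped; a fuller version would parse them as YAML metadata.
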
### Use in Agent OS
- **Cross-session context**: Agents retain knowledge between conversations
- **Knowledge graph**: Relations between entities (projects, decisions, lessons learned)
- **Multi-agent**: All agents read from and write to the same memory
- **Orchestrator documents**: The manager records every interaction and decision
- **Human edits**: The user can edit the Markdown files directly

### Claude Session Logger
Complements Basic Memory by automatically recording:
- Conversation sessions with Claude
- Tools used
- Decisions made
- Errors and their fixes

## Trained Models (Custom)

### agent-os-adapter-1.5b
- **Base**: Qwen 2.5 1.5B Instruct
- **Training**: LoRA (r=32, alpha=64), 7 epochs, 415 examples x4
- **Role**: Convert natural language → JSON (SQL, CLI, shell)
- **Deployment**: GGUF Q8 on llama-server (CPU, 1.6 GB RAM, ~3 s/query)
- **Repos**:
  - Adapter: `devsomosahub/agent-os-adapter-1.5b`
  - Merged: `devsomosahub/agent-os-1b5-merged`

### agent-os-adapter-7b
- **Base**: Qwen 2.5 7B Instruct
- **Training**: LoRA Q4, same configuration
- **Role**: Same as the 1.5B adapter, but more accurate
- **Repos**:
  - Adapter: `devsomosahub/agent-os-adapter-7b`
  - Merged: `devsomosahub/agent-os-7b-merged`

### Known limitation
When asked to write a query directly, the custom models invent column names based on the training dataset. Workaround: a two-step flow that queries `information_schema` first, then builds the query using the real columns.

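A minimal sketch of that two-step flow, under stated assumptions: `RunSql` is a hypothetical query runner (the real system would presumably go through the Supabase proxy), and `schemaQuery`, `unknownColumns`, and `realColumns` are illustrative helpers, not existing code.

```typescript
// Hypothetical query runner: the real system would go through the Supabase proxy.
type RunSql = (sql: string) => Promise<Array<Record<string, unknown>>>;

// Step 1: ask information_schema which columns the table really has.
function schemaQuery(table: string): string {
  // Only plain identifiers are accepted, since the name is interpolated into SQL.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  return `SELECT column_name FROM information_schema.columns WHERE table_name = '${table}'`;
}

// Step 2 guard: report columns the model used that do not exist in the table.
function unknownColumns(used: string[], real: string[]): string[] {
  const known = new Set(real.map((c) => c.toLowerCase()));
  return used.filter((c) => !known.has(c.toLowerCase()));
}

// Fetch the real column names through the injected runner.
async function realColumns(run: RunSql, table: string): Promise<string[]> {
  const rows = await run(schemaQuery(table));
  return rows.map((r) => String(r["column_name"]));
}
```

If `unknownColumns` returns anything, the generated query can be rejected or regenerated with the real column list in the prompt before it reaches the database.
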
## External APIs Used

| Service | Use | Authentication |
|---------|-----|----------------|
| HuggingFace (Pro) | Free Inference API (Llama 70B), training, endpoints | HF token |
| Anthropic | Claude Opus 4.6 for the coding agent | API key |
| OpenRouter | Alternative LLMs, fallback | API key |
| Vultr | Servers (board VMs, Agent OS server) | API key |
| Supabase | Project databases (Cloud-Hub, Hubia) | Access token |
| GitHub | Repos, issues, PRs | gh CLI token |

## Server Ports (207.246.65.100)

| Port | Service | Access |
|------|---------|--------|
| 3000 | Agent OS (frontend + API) | Public |
| 3002 | Claude Launcher Web | Internal |
| 8080 | llama-server (1.5B model) | Internal |
| 8082 | SmolAgent daemon | Internal |
| 9090 | Agent bot | Internal |

## Next Steps

1. **Implement the Agent Registry** - JSON config for pluggable agents
2. **Implement the Orchestrator** - Manager that routes between agents
3. **Integrate Basic Memory** - MCP Server as the central memory
4. **Integrate the Session Logger** - Automatic session logging
5. **Implement the Placeholder apps** - Inbox, Mission Control, Agents, Finder, Settings
6. **Guide/assistant model** - Train a model that explains the system to the user
7. **Agent dashboard** - UI to view, swap, and configure agents in real time