---
title: Multi-LLM API Gateway
emoji: πŸ›‘οΈ
colorFrom: indigo
colorTo: red
sdk: docker
pinned: true
license: apache-2.0
short_description: LLM API Gateway with MCP interface
---

# Multi-LLM API Gateway with MCP interface
β€” or Universal MCP Hub (Sandboxed)  
β€” or universal AI Wrapper over SSE + Quart with some tools on a solid fundament

aka: a clean, secure starting point for your own projects.
Pick the description that fits your use case. They're all correct.

> A production-grade MCP server that actually thinks about security.  
> Built on [PyFundaments](PyFundaments.md) β€” running on **simpleCity** and **paranoidMode**.

```
No key β†’ no tool β†’ no crash β†’ no exposed secrets
```

Most MCP servers are prompts dressed up as servers. This one has a real architecture.

---

## Why this exists

While building this, we kept stumbling over the same problem β€” the MCP 
ecosystem is full of servers with hardcoded keys, `os.environ` scattered 
everywhere, zero sandboxing. One misconfigured fork and your API keys are gone.

This is exactly the kind of negligence (and worse β€” outright fraud) that 
[Wall of Shames](https://github.com/Wall-of-Shames) documents: a 
community project exposing fake "AI tools" that exploit non-technical users 
β€” API wrappers dressed up as custom models, Telegram payment funnels, 
bought stars. If you build on open source, you should know this exists.

This hub was built as the antidote:

- **Structural sandboxing** β€” `app/*` can never touch `fundaments/` or `.env`. Not by convention. By design.
- **Guardian pattern** β€” `main.py` is the only process that reads secrets. It injects validated services as a dict. `app/*` never sees the raw environment.
- **Graceful degradation** β€” No key? Tool doesn't register. Server still starts. No crash, no error, no empty `None` floating around.
- **Single source of truth** β€” All tool/provider/model config lives in `app/.pyfun`. Adding a provider means editing one config block (plus, for a brand-new provider class, one registry line in `providers.py`).
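The Guardian pattern from the bullets above can be sketched in a few lines. This is a minimal illustration, not the actual `main.py`; the function names and placeholder services are assumptions:

```python
# guardian_sketch.py β€” illustrative Guardian pattern (not the real main.py).
# Only this process reads the environment; the app layer gets a plain dict.
import os
from typing import Any, Dict


def build_fundaments() -> Dict[str, Any]:
    """Read secrets once, initialize services conditionally, return a dict."""
    fundaments: Dict[str, Any] = {
        "config": {"log_level": os.environ.get("LOG_LEVEL", "INFO")},
        "db": None,          # stays None unless DATABASE_URL is set
        "encryption": None,  # stays None unless keys are present
    }
    if os.environ.get("DATABASE_URL"):
        fundaments["db"] = object()  # stand-in for a real asyncpg pool
    return fundaments


def start_app(fundaments: Dict[str, Any]) -> str:
    """App layer: consumes injected services only, never touches os.environ."""
    return "db enabled" if fundaments["db"] is not None else "db skipped"
```

Missing services arrive as `None` rather than crashing the startup path, which is what makes the graceful degradation above possible.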

---

## Architecture

```
main.py (Guardian)
β”‚
β”‚  reads .env / HF Secrets
β”‚  initializes fundaments/* conditionally
β”‚  injects validated services as dict
β”‚
└──► app/app.py (Orchestrator, sandboxed)
     β”‚
     β”‚  unpacks fundaments ONCE, at startup, never stores globally
     β”‚  starts hypercorn (async ASGI)
     β”‚  routes: GET / | POST /api | GET+POST /mcp
     β”‚
     β”œβ”€β”€ app/mcp.py         ← FastMCP + SSE handler
     β”œβ”€β”€ app/tools.py       ← Tool registry (key-gated)
     β”œβ”€β”€ app/providers.py   ← LLM + Search execution + fallback chain
     β”œβ”€β”€ app/models.py      ← Model limits, costs, capabilities
     β”œβ”€β”€ app/config.py      ← .pyfun parser (single source of truth)
     └── app/db_sync.py     ← Internal SQLite IPC (app/* state only)
                              β‰  fundaments/postgresql.py (Guardian-only)
```

**The sandbox is structural:**

```python
# app/app.py β€” fundaments are unpacked ONCE, NEVER stored globally
async def start_application(fundaments: Dict[str, Any]) -> None:
    config_service         = fundaments["config"]
    db_service             = fundaments["db"]          # None if not configured
    encryption_service     = fundaments["encryption"]  # None if keys missing
    access_control_service = fundaments["access_control"]
    ...
    # From here: app/* reads its own config from app/.pyfun only.
    # fundaments are never passed into other app/* modules.
```

`app/app.py` never calls `os.environ`. Never imports from `fundaments/`. Never reads `.env`.  
This isn't documentation. It's enforced by the import structure.

### Why Quart + hypercorn?

MCP over SSE needs a proper async HTTP stack. The choice here is deliberate:

**Quart** is async Flask β€” same API, same routing, but fully `async/await` native. This matters because FastMCP's SSE handler is async, and mixing sync Flask with async MCP would require thread hacks or `asyncio.run()` gymnastics. With Quart, the `/mcp` route hands off directly to `mcp.handle_sse(request)` β€” no bridging, no blocking.

**hypercorn** is an ASGI server (vs. waitress/gunicorn which are WSGI). WSGI servers handle one request per thread β€” fine for traditional web apps, wrong for SSE where a connection stays open for minutes. hypercorn handles SSE connections as long-lived async streams without tying up threads. It also runs natively on HuggingFace Spaces without extra config.

The `/mcp` route in `app.py` is also the natural interception point β€” auth checks, rate limiting, payload logging can all be added there before the request ever reaches FastMCP. That's not possible when FastMCP runs standalone.
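The "long-lived async stream" point can be made concrete with plain `asyncio`. This is an illustrative sketch of SSE framing, not the actual FastMCP handler; the function names are assumptions:

```python
# sse_sketch.py β€” why SSE wants an async stack: a long-lived event stream
# modeled as an async generator (illustrative, not the FastMCP handler).
import asyncio
from typing import AsyncIterator, List


async def sse_stream(events: List[str]) -> AsyncIterator[str]:
    """Yield Server-Sent Events frames without tying up a thread."""
    for ev in events:
        # In a real server, frames may arrive minutes apart; 'await'
        # returns control to the event loop so it can serve other clients.
        await asyncio.sleep(0)
        yield f"data: {ev}\n\n"


async def collect(events: List[str]) -> str:
    """Drain the stream (a test helper; a real client reads incrementally)."""
    return "".join([frame async for frame in sse_stream(events)])
```

A WSGI worker would have to hold one thread open for the entire lifetime of such a stream; an ASGI server multiplexes many of them on one event loop.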

---

## Two Databases β€” One Architecture

This hub runs **two completely separate databases** with distinct responsibilities. This is not redundancy β€” it's a deliberate performance and security decision.

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Guardian Layer (fundaments/*)                              β”‚
β”‚                                                             β”‚
β”‚  postgresql.py   β†’ Cloud DB (e.g. Neon, Supabase)          β”‚
β”‚                    asyncpg pool, SSL enforced               β”‚
β”‚                    Neon-specific quirks handled             β”‚
β”‚                    (statement_timeout stripped, keepalives) β”‚
β”‚                                                             β”‚
β”‚  user_handler.py β†’ SQLite (users + sessions tables)        β”‚
β”‚                    PBKDF2-SHA256 password hashing           β”‚
β”‚                    Session validation incl. IP + UserAgent  β”‚
β”‚                    Account lockout after 5 failed attempts  β”‚
β”‚                    Path: SQLITE_PATH env var or app/        β”‚
β”‚                                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚ inject as fundaments dict
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  App Layer (app/*)                                          β”‚
β”‚                                                             β”‚
β”‚  db_sync.py  β†’ SQLite (hub_state + tool_cache tables)      β”‚
β”‚                aiosqlite (async, non-blocking)              β”‚
β”‚                NEVER touches users/sessions tables          β”‚
β”‚                Auto-relocated to /tmp/ on HF Spaces        β”‚
β”‚                                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

**Why two SQLite databases?**

`user_handler.py` (Guardian) owns `users` and `sessions` β€” authentication state that must be isolated from the app layer. `db_sync.py` (app/*) owns `hub_state` and `tool_cache` β€” fast, async IPC between tools that doesn't need to leave the process, let alone hit a cloud endpoint.

A tool caching a previous LLM response or storing intermediate state between pipeline steps should never wait on a round-trip to Neon. Local SQLite is microseconds. Cloud PostgreSQL is 50-200ms per query. For tool-to-tool communication, that difference matters.

**Table ownership β€” hard rule:**

| Table | Owner | Access |
| :--- | :--- | :--- |
| `users` | `fundaments/user_handler.py` | Guardian only |
| `sessions` | `fundaments/user_handler.py` | Guardian only |
| `hub_state` | `app/db_sync.py` | app/* only |
| `tool_cache` | `app/db_sync.py` | app/* only |

`db_sync.py` uses the same SQLite path (`SQLITE_PATH`) as `user_handler.py` β€” same file, different tables, zero overlap. The `db_query` MCP tool exposes SELECT-only access to `hub_state` and `tool_cache`. It cannot reach `users` or `sessions`.
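The SELECT-only rule for `db_query` can be enforced with a small gate before any SQL reaches the database. A deliberately strict sketch, assuming nothing about the real implementation:

```python
# select_only_sketch.py β€” illustrative guard for a SELECT-only query tool.
# The real db_query enforcement may differ; this shows the principle.

BLOCKED_TABLES = {"users", "sessions"}  # Guardian-owned, never exposed


def check_query(sql: str) -> str:
    """Reject anything that is not a plain SELECT on app-owned tables.

    The table check is a blunt substring match β€” it over-blocks on
    purpose, which is the right failure mode for a security gate.
    """
    normalized = sql.strip().lower()
    if not normalized.startswith("select"):
        raise ValueError("db_query is SELECT-only")
    if any(table in normalized for table in BLOCKED_TABLES):
        raise ValueError("table is Guardian-only")
    return sql
```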

**Cloud DB (postgresql.py):**

Handles the heavy cases β€” persistent storage, workflow tool results that need to survive restarts, anything that benefits from a real relational DB. Neon-specific quirks are handled automatically: `statement_timeout` is stripped from the DSN (Neon doesn't support it), SSL is enforced at `require` minimum, keepalives are set, and terminated connections trigger an automatic pool restart.

If no `DATABASE_URL` is set, the entire cloud DB layer is skipped cleanly. The app runs without it.
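The DSN normalization described above can be sketched with the standard library. This is an assumption-laden illustration (the real `postgresql.py` may normalize differently), but it shows the shape of the fix:

```python
# dsn_sketch.py β€” illustrative DSN cleanup in the spirit of the Neon
# quirks above (the exact parameter handling is an assumption).
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_dsn(dsn: str) -> str:
    """Strip unsupported options and enforce SSL on a PostgreSQL DSN."""
    parts = urlsplit(dsn)
    query = dict(parse_qsl(parts.query))
    query.pop("statement_timeout", None)    # Neon rejects this option
    query.setdefault("sslmode", "require")  # never run below 'require'
    return urlunsplit(parts._replace(query=urlencode(query)))
```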

---

## Tools

Tools register themselves at startup β€” only if the required API key exists in the environment. No key, no tool. The server always starts.

| ENV Secret | Tool | Notes |
| :--- | :--- | :--- |
| `ANTHROPIC_API_KEY` | `llm_complete` | Claude Haiku / Sonnet / Opus |
| `GEMINI_API_KEY` | `llm_complete` | Gemini 2.0 / 2.5 / 3.x Flash & Pro |
| `OPENROUTER_API_KEY` | `llm_complete` | 100+ models via OpenRouter |
| `HF_TOKEN` | `llm_complete` | HuggingFace Inference API |
| `BRAVE_API_KEY` | `web_search` | Independent web index |
| `TAVILY_API_KEY` | `web_search` | AI-optimized search with synthesized answers |
| `DATABASE_URL` | `db_query` | Read-only SELECT β€” enforced at app level |
| *(always)* | `list_active_tools` | Shows key names only β€” never values |
| *(always)* | `health_check` | Status + uptime |
| *(always)* | `get_model_info` | Limits, costs, capabilities per model |

**Configured in `.pyfun` β€” not hardcoded:**

```ini
[TOOL.code_review]
active           = "true"
description      = "Review code for bugs, security issues and improvements"
provider_type    = "llm"
default_provider = "anthropic"
timeout_sec      = "60"
system_prompt    = "You are an expert code reviewer. Analyze the given code for bugs, security issues, and improvements. Be specific and concise."
[TOOL.code_review_END]
```

Current built-in tools: `llm_complete`, `code_review`, `summarize`, `translate`, `web_search`, `db_query`  
Future hooks (commented, ready): `image_gen`, `code_exec`, `shellmaster`, Discord, GitHub webhooks

---

## LLM Fallback Chain

All LLM providers share one `llm_complete` tool. If a provider fails, the hub automatically walks the fallback chain defined in `.pyfun`:

```
anthropic β†’ gemini β†’ openrouter β†’ huggingface
```

Fallbacks are configured per-provider, not hardcoded:

```ini
[LLM_PROVIDER.anthropic]
fallback_to = "gemini"
[LLM_PROVIDER.anthropic_END]

[LLM_PROVIDER.gemini]
fallback_to = "openrouter"
[LLM_PROVIDER.gemini_END]
```

Same pattern applies to search providers (`brave β†’ tavily`).
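The chain walk itself is simple. A sketch under stated assumptions β€” the mapping mirrors the `.pyfun` blocks above, while the call logic is a stand-in for the real provider code:

```python
# fallback_sketch.py β€” illustrative walk of the fallback chain; the
# provider call is injected so the walking logic stays testable.
from typing import Callable, Dict, Set

FALLBACK_TO: Dict[str, str] = {  # mirrors fallback_to in .pyfun
    "anthropic": "gemini",
    "gemini": "openrouter",
    "openrouter": "huggingface",
    "huggingface": "",  # end of chain
}


def complete(prompt: str, provider: str,
             call: Callable[[str, str], str]) -> str:
    """Try each provider in turn until one succeeds or the chain ends."""
    seen: Set[str] = set()
    while provider and provider not in seen:  # guard against config cycles
        seen.add(provider)
        try:
            return call(provider, prompt)
        except Exception:
            provider = FALLBACK_TO.get(provider, "")
    raise RuntimeError("all providers in the fallback chain failed")
```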

---

## Quick Start

### HuggingFace Spaces (recommended)

1. Fork / duplicate this Space
2. Go to **Settings β†’ Variables and secrets**
3. Add the API keys you have (any subset works)
4. Space starts automatically β€” only tools with valid keys register

That's it. No config editing. No code changes.

[β†’ Live Demo Space](https://huggingface.co/spaces/codey-lab/Multi-LLM-API-Gateway) (no LLM keys set!)

### Local / Docker

```bash
git clone https://github.com/VolkanSah/Multi-LLM-API-Gateway
cd Multi-LLM-API-Gateway
cp example-mcp___.env .env
# fill in your keys
pip install -r requirements.txt
python main.py
```

Minimum required ENV vars (everything else is optional):

```env
PYFUNDAMENTS_DEBUG=""
LOG_LEVEL="INFO"
LOG_TO_TMP=""
ENABLE_PUBLIC_LOGS="true"
HF_TOKEN=""
HUB_SPACE_URL=""
MCP_TRANSPORT="sse"
```

---

## Connect an MCP Client

### Claude Desktop / any SSE-compatible client

```json
{
  "mcpServers": {
    "universal-mcp-hub": {
      "url": "https://YOUR_USERNAME-universal-mcp-hub.hf.space/sse"
    }
  }
}
```

### Private Space (with HF token)

```json
{
  "mcpServers": {
    "universal-mcp-hub": {
      "url": "https://YOUR_USERNAME-universal-mcp-hub.hf.space/sse",
      "headers": {
        "Authorization": "Bearer hf_..."
      }
    }
  }
}
```

---

## Desktop Client

A full PySide6 desktop client is included in `DESKTOP_CLIENT/hub.py` β€” ideal for private or non-public Spaces where you don't want to expose the SSE endpoint.

```bash
pip install PySide6 httpx
# optional file handling:
pip install Pillow PyPDF2 pandas openpyxl
python DESKTOP_CLIENT/hub.py
```

**Features:**
- Multi-chat with persistent history (`~/.mcp_desktop.json`)
- Tool/Provider/Model selector loaded live from your Hub
- File attachments: images, PDF, CSV, Excel, ZIP, source code
- Connect tab with health check + auto-load
- Settings: HF Token + Hub URL saved locally, never sent anywhere except your own Hub
- Full request/response log with timestamps
- Runs on Windows, Linux, macOS

[β†’ Desktop Client docs](DESKTOP_CLIENT/README.md)

---

## Configuration (.pyfun)

`app/.pyfun` is the single source of truth for all app behavior. Three tiers β€” use what you need:

```
LAZY:       [HUB] + one [LLM_PROVIDER.*]                    β†’ works
NORMAL:     + [SEARCH_PROVIDER.*] + [MODELS.*]              β†’ works better  
PRODUCTIVE: + [TOOLS] + [HUB_LIMITS] + [DB_SYNC]           β†’ full power
```

Adding a new LLM provider requires two steps β€” `.pyfun` + one line in `providers.py`:

```ini
# 1. app/.pyfun β€” add provider block
[LLM_PROVIDER.mistral]
active        = "true"
base_url      = "https://api.mistral.ai/v1"
env_key       = "MISTRAL_API_KEY"
default_model = "mistral-large-latest"
models        = "mistral-large-latest, mistral-small-latest, codestral-latest"
fallback_to   = ""
[LLM_PROVIDER.mistral_END]
```

```python
# 2. app/providers.py β€” uncomment the dummy + register it
_PROVIDER_CLASSES = {
    ...
    "mistral": MistralProvider,   # ← uncomment to activate
}
```

`providers.py` ships with ready-to-use commented dummy classes for OpenAI, Mistral, and xAI/Grok β€” each with the matching `.pyfun` block right above it. Most OpenAI-compatible APIs need zero changes to the class itself, just a different `base_url` and `env_key`. Search providers (Brave, Tavily) follow the same pattern and are next on the roadmap.

Model limits, costs, and capabilities are also configured here β€” `get_model_info` reads directly from `.pyfun`:

```ini
[MODEL.claude-sonnet-4-6]
provider           = "anthropic"
context_tokens     = "200000"
max_output_tokens  = "16000"
requests_per_min   = "50"
cost_input_per_1k  = "0.003"
cost_output_per_1k = "0.015"
capabilities       = "text, code, analysis, vision"
[MODEL.claude-sonnet-4-6_END]
```
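The `[NAME] … [NAME_END]` block style shown above parses in a dozen lines. A minimal sketch of the format only β€” the actual `config.py` parser surely does more (validation, type coercion):

```python
# pyfun_sketch.py β€” minimal parser for the block format shown above
# ([NAME] ... [NAME_END], quoted values). The real parser may differ.
from typing import Dict, Optional


def parse_pyfun(text: str) -> Dict[str, Dict[str, str]]:
    blocks: Dict[str, Dict[str, str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("_END]"):
            current = None                  # close the open block
        elif line.startswith("[") and line.endswith("]"):
            current = line[1:-1]            # open a new block
            blocks[current] = {}
        elif current and "=" in line:
            key, _, value = line.partition("=")
            blocks[current][key.strip()] = value.strip().strip('"')
    return blocks
```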

---

## Dependencies

```
# PyFundaments Core (always required)
asyncpg          β€” async PostgreSQL pool (Guardian/cloud DB)
python-dotenv    β€” .env loading
passlib          β€” PBKDF2 password hashing in user_handler.py
cryptography     β€” encryption layer in fundaments/

# MCP Hub
fastmcp          β€” MCP protocol + tool registration
httpx            β€” async HTTP for all provider API calls
quart            β€” async Flask (ASGI) β€” needed for SSE + hypercorn
hypercorn        β€” ASGI server β€” long-lived SSE connections, HF Spaces native
requests         β€” sync HTTP for tool workers

# Optional (uncomment in requirements.txt as needed)
# aiofiles       β€” async file ops (ML pipelines, file uploads)
# discord.py     β€” Discord bot integration (app/discord_api.py, planned)
# PyNaCl         β€” Discord signature verification
# psycopg2-binary β€” alternative PostgreSQL driver
```

The core stack is intentionally lean. `asyncpg` + `quart` + `hypercorn` + `fastmcp` + `httpx` covers the full MCP server. Everything else is opt-in.

---

## Security Design

- API keys live in HF Secrets / `.env` β€” never in `.pyfun`, never in code
- `list_active_tools` returns key **names** only β€” never values
- `db_query` is SELECT-only, enforced at application level (not just docs)
- `app/*` has zero import access to `fundaments/` internals
- Direct execution of `app/app.py` is blocked by design β€” it prints a warning and falls back to a null-fundaments dict
- `fundaments/` is initialized conditionally β€” missing services degrade gracefully, they don't crash

> PyFundaments is not perfect. But it's more secure than most of what runs in production today.

[β†’ Full Security Policy](SECURITY.md)

---

## Foundation

This hub is built on [PyFundaments](PyFundaments.md) β€” a security-first Python boilerplate providing:

- `config_handler.py` β€” env loading with validation
- `postgresql.py` β€” async DB pool (Guardian-only)
- `encryption.py` β€” key-based encryption layer
- `access_control.py` β€” role/permission management
- `user_handler.py` β€” user lifecycle management  
- `security.py` β€” unified security manager composing the above

None of these are accessible from `app/*`. They are injected as a validated dict by `main.py`.

[β†’ PyFundaments Function Overview](PyFundaments%20–%20Function%20Overview.md)  
[β†’ Module Docs](docs/app/)  
[β†’ Source of this REPO](https://github.com/VolkanSah/Multi-LLM-API-Gateway)

---

## History

[ShellMaster](https://github.com/VolkanSah/ChatGPT-ShellMaster) (2023, MIT) was the precursor β€” browser-accessible shell for ChatGPT with session memory via `/tmp/shellmaster_brain.log`, built before MCP was even a concept. Universal MCP Hub is its natural evolution.

---

## License

Dual-licensed:

- [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
- [Ethical Security Operations License v1.1 (ESOL)](ESOL) β€” mandatory, non-severable

By using this software you agree to all ethical constraints defined in ESOL v1.1.

---

*Architecture, security decisions, and PyFundaments by Volkan KΓΌcΓΌkbudak.*  
*Built with Claude (Anthropic) as a typing assistant for docs & the occasional bug.*

> crafted with passion β€” just wanted to understand how it works, don't actually need it, have a CLI πŸ˜„