MAC: MBM AI Cloud · Build Progress
Project: Self-hosted AI inference platform for MBM University Jodhpur
Stack: FastAPI · SvelteKit · PostgreSQL · Redis · vLLM · Nginx · Docker
Repo: D:\mac2 (push to github.com/mbmuniversity2026/MAC)
✅ Completed
Session 1: Backend Foundation
Database / Migrations
- Alembic wired: `alembic/env.py` imports all models
- `20260426_0001_initial_schema.py`: full initial schema capture
- `20260427_0002_session1_tables.py`: feature_flags, system_config, branches, sections, cluster_heartbeats, shared_files, file_downloads, video_projects, video_jobs + user columns
New Models (mac/models/)
| File | Tables |
|---|---|
| `feature_flag.py` | FeatureFlag |
| `academic.py` | Branch, Section |
| `cluster.py` | ClusterNode, ClusterHeartbeat |
| `file_share.py` | SharedFile, FileDownload |
| `video.py` | VideoProject, VideoJob |
| `system_config.py` | SystemConfig |
New Routers (mac/routers/)
| File | Endpoints |
|---|---|
| `features.py` | GET /features/status, PATCH /admin/features/{key} |
| `hardware.py` | GET /hardware/local, /hardware/recommendations |
| `network.py` | GET /network/local-ip, /network/discover |
| `system.py` | GET /system/version, /system/update-status, POST /admin/system/restart |
| `setup.py` | GET /setup/status, POST /setup/create-admin, GET /setup/recovery |
New Services (mac/services/)
- `feature_seeder.py`: seeds default flags on startup
- `setup_service.py`: JWT secret management via system_config
- `token_blacklist_service.py`: JWT blacklist via Redis (TTL-matched)
Security Improvements
- JWT `jti` claim added to all access tokens (`mac/utils/security.py`)
- `auth_middleware.py` checks the blacklist on every request
- Logout blacklists the current access token and revokes refresh tokens
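The TTL-matched blacklist described above can be illustrated with a minimal sketch. It is not the project's `token_blacklist_service.py`: the key prefix `jwt:blacklist:`, the function names, and the `InMemoryStore` class (a stand-in exposing Redis-like `setex`/`exists` methods so the example runs without a Redis server) are all assumptions. The idea is that a blacklisted `jti` only needs to be remembered until the token would have expired anyway.

```python
import math
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryStore:
    """Illustrative stand-in for Redis with SETEX/EXISTS-like behaviour."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def exists(self, key: str) -> bool:
        entry = self._data.get(key)
        if entry is None:
            return False
        if entry[1] <= self._clock():
            del self._data[key]  # lazily drop expired entries
            return False
        return True


def _key(jti: str) -> str:
    # Hypothetical key layout; the real service may use a different prefix.
    return f"jwt:blacklist:{jti}"


def blacklist_token(store, jti: str, exp: float, now: Optional[float] = None) -> bool:
    """Blacklist a token id until its own expiry; returns False if already expired."""
    now = time.time() if now is None else now
    ttl = math.ceil(exp - now)  # round up so the entry never expires early
    if ttl <= 0:
        return False  # an expired token is rejected anyway, nothing to store
    store.setex(_key(jti), ttl, "1")
    return True


def is_blacklisted(store, jti: str) -> bool:
    """Check that a middleware could run on every request."""
    return bool(store.exists(_key(jti)))
```

Matching the TTL to the remaining token lifetime keeps the blacklist from growing without bound, since entries disappear at the moment the token would stop validating on its own.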
Session 2: SvelteKit Frontend
Full PWA frontend at frontend/:
| File | Description |
|---|---|
| `src/app.html` | PWA shell, Google Fonts, SW registration |
| `src/app.css` | Full design system: dark theme, mac-blue palette, all component classes |
| `src/lib/api.js` | Complete API client (auth, query, models, usage, quota, keys, features, hardware, network, system, users, guardrails, rag, notifications, cluster, academic, files) |
| `src/lib/stores.js` | authStore, chatStore, setupStore, featureStore, toast, sidebarOpen |
| `src/lib/i18n.js` | 19 Indian languages with lazy loading + RTL support |
| `src/lib/components/ParticleCanvas.svelte` | Physics particle animation (login/splash) |
| `src/lib/components/Sidebar.svelte` | Navigation sidebar with all routes |
| `src/lib/components/Toast.svelte` | Toast notification component |
| `src/lib/components/ChatMessage.svelte` | Chat bubble with markdown rendering |
| `src/routes/+layout.svelte` | Auth guard, setup check, shell layout |
| `src/routes/+page.svelte` | Landing / redirect |
| `src/routes/login/+page.svelte` | Animated login page with particle canvas |
| `src/routes/setup/+page.svelte` | First-run admin setup wizard |
| `src/routes/chat/+page.svelte` | SSE streaming chat with model picker |
| `src/routes/dashboard/+page.svelte` | Activity heatmap, quota rings, model distribution |
| `src/routes/admin/+page.svelte` | Admin panel: Users, Models, Features, Hardware, System tabs |
| `src/routes/cluster/+page.svelte` | Cluster management: node list, detail, actions, history chart, enrollment tokens |
| `src/routes/keys/+page.svelte` | API key management (generate, copy, revoke) |
| `src/routes/settings/+page.svelte` | Profile, change password, language picker |
| `src/routes/notifications/+page.svelte` | Notification list with mark-read |
| `src/routes/rag/+page.svelte` | RAG document upload (drag-and-drop) + list |
Static assets:
- `static/manifest.json`: PWA manifest with shortcuts
- `static/sw.js`: service worker (cache-first shell, network-first API, SSE passthrough)
Infrastructure:
- `nginx/nginx.conf`: HTTP server (production)
- `nginx/nginx.https.conf`: HTTPS server with TLS, HSTS, WebSocket proxy
- `docker-compose.yml`: master node (MAC API + vLLM + Postgres + Redis + Nginx + Qdrant + SearXNG)
- `docker-compose.worker.yml`: worker node (vLLM + optional Jupyter + worker-agent)
Session 2: Distributed Cluster Backend
Cluster Architecture
Master node (this machine)
├── MAC API (FastAPI) - receives all user requests
├── PostgreSQL - DB (master-only)
├── Redis - cache, rate limiting, JWT blacklist
├── Nginx - reverse proxy + frontend
├── Qdrant - vector DB for RAG
└── SearXNG - web search
Worker nodes (any PC on same network)
├── vLLM - GPU inference (OpenAI-compatible)
├── Jupyter kernel gateway - notebook execution (optional)
└── worker_agent.py - heartbeat + registration agent
Cluster Services
- `mac/services/load_balancer.py`: score-based routing with score `gpu_util×0.5 + vram_ratio×0.3` and a 30s stale threshold
- `mac/services/llm_service.py`: updated `_resolve_model_cluster` to use the load balancer before local vLLM
- `mac/models/node.py`: `WorkerNode` + `NodeModelDeployment` + `EnrollmentToken`; added `notebook_port`, `tags`
- `mac/models/cluster.py`: `ClusterHeartbeat` time-series
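As a rough illustration of the score-based routing above, the sketch below ranks nodes by the stated formula and ignores nodes whose last heartbeat is older than 30 seconds. Several things are assumptions rather than facts about `load_balancer.py`: that a lower score means a less loaded (preferred) node, that `gpu_util` is a 0.0 to 1.0 fraction, and the `NodeSnapshot` type and function names. The stated weights sum to 0.8, so the real score may include terms not listed in this document.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

STALE_AFTER_SECONDS = 30.0  # stale threshold stated in the document


@dataclass
class NodeSnapshot:
    """Hypothetical view of a worker's latest heartbeat."""
    node_id: str
    gpu_util: float        # assumed to be a fraction in [0.0, 1.0]
    vram_used_mb: float
    vram_total_mb: float
    last_heartbeat: float  # Unix timestamp of the latest heartbeat


def load_score(node: NodeSnapshot) -> float:
    """Score from the document: gpu_util * 0.5 + vram_ratio * 0.3."""
    if node.vram_total_mb > 0:
        vram_ratio = node.vram_used_mb / node.vram_total_mb
    else:
        vram_ratio = 1.0  # treat unknown VRAM as fully used (assumption)
    return node.gpu_util * 0.5 + vram_ratio * 0.3


def pick_node(nodes: Iterable[NodeSnapshot], now: float) -> Optional[NodeSnapshot]:
    """Return the least-loaded fresh node, or None if every node is stale."""
    fresh = [n for n in nodes if now - n.last_heartbeat <= STALE_AFTER_SECONDS]
    if not fresh:
        return None  # a caller could fall back to local vLLM here
    return min(fresh, key=load_score)
```

Returning `None` when no fresh node exists leaves the fallback decision to the caller, which fits the description of trying the load balancer before local vLLM.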
Cluster Router (mac/routers/cluster.py)
| Endpoint | Description |
|---|---|
| `POST /cluster/enroll-token` | Admin generates one-time enrollment token |
| `GET /cluster/enroll-tokens` | List all tokens |
| `POST /cluster/register` | Worker self-registers (no JWT; uses enrollment token) |
| `POST /cluster/heartbeat` | Worker sends heartbeat every 10s |
| `GET /cluster/nodes` | List all nodes with live health |
| `GET /cluster/nodes/{id}` | Node detail with deployments |
| `POST /cluster/nodes/{id}/action` | approve / drain / reactivate / remove |
| `POST /cluster/nodes/{id}/deploy` | Register vLLM deployment on node |
| `DELETE /cluster/nodes/{id}/deploy/{dep_id}` | Remove deployment |
| `GET /cluster/nodes/{id}/history` | Heartbeat time-series (for charts) |
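The one-time enrollment token used by `POST /cluster/enroll-token` and `POST /cluster/register` could work along the lines of the sketch below. This is a hypothetical in-memory version, not the project's `EnrollmentToken` model: the class and method names are invented for illustration, and storing only a SHA-256 digest of the token is an assumed design choice. The `label` and `expires_hours` parameters mirror the fields in the enrollment request shown in the quick-start.

```python
import hashlib
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EnrollmentRecord:
    label: str
    expires_at: float   # Unix timestamp
    used: bool = False


class EnrollmentTokens:
    """Illustrative in-memory registry of one-time enrollment tokens."""

    def __init__(self) -> None:
        self._records: Dict[str, EnrollmentRecord] = {}

    @staticmethod
    def _digest(token: str) -> str:
        # Keep only a hash so a leaked table does not reveal usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, label: str, expires_hours: float, now: Optional[float] = None) -> str:
        """Create a token; the plaintext is returned once and never stored."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._records[self._digest(token)] = EnrollmentRecord(
            label=label, expires_at=now + expires_hours * 3600
        )
        return token

    def consume(self, token: str, now: Optional[float] = None) -> bool:
        """Accept a token at most once, and only before it expires."""
        now = time.time() if now is None else now
        record = self._records.get(self._digest(token))
        if record is None or record.used or record.expires_at <= now:
            return False
        record.used = True
        return True
```

A worker registration handler would call something like `consume` before creating the node record, so a replayed or expired token cannot enroll a second machine.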
Worker Agent (worker_agent.py)
Standalone Python script for worker PCs:
- Reads `MAC_MASTER_URL`, `MAC_ENROLL_TOKEN`, `MAC_VLLM_PORT`, etc. from env
- Self-registers on startup via enrollment token
- Sends heartbeats every 10s with GPU/CPU/RAM metrics (via `pynvml` + `psutil`)
- Queries local vLLM `/v1/models` to report active models
- Handles stale/auth errors gracefully
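The agent behaviour listed above can be sketched in a dependency-free way. The environment variable names come from this document; everything else is an assumption for illustration: the `AgentConfig` type, the default port of 8001 (taken from the quick-start example, not a confirmed default), and the heartbeat payload field names. Metric collection is injected as a callable so the sketch runs without `pynvml` or `psutil` installed.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Mapping

HEARTBEAT_INTERVAL_SECONDS = 10  # interval stated in the document


@dataclass(frozen=True)
class AgentConfig:
    master_url: str
    enroll_token: str
    vllm_port: int


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Read agent settings from the environment, failing fast on missing values."""
    missing = [k for k in ("MAC_MASTER_URL", "MAC_ENROLL_TOKEN") if not env.get(k)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return AgentConfig(
        master_url=env["MAC_MASTER_URL"].rstrip("/"),
        enroll_token=env["MAC_ENROLL_TOKEN"],
        # 8001 is an assumed default, borrowed from the quick-start example.
        vllm_port=int(env.get("MAC_VLLM_PORT", "8001")),
    )


def build_heartbeat(
    node_id: str,
    read_metrics: Callable[[], Dict[str, float]],
    active_models: Iterable[str],
) -> dict:
    """Assemble one heartbeat body; the field names here are hypothetical."""
    return {
        "node_id": node_id,
        "metrics": read_metrics(),      # e.g. GPU/CPU/RAM readings
        "models": list(active_models),  # e.g. ids reported by vLLM /v1/models
    }
```

In a real agent, `read_metrics` would wrap the `pynvml` and `psutil` calls, and the resulting dictionary would be posted to the heartbeat endpoint every `HEARTBEAT_INTERVAL_SECONDS`.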
Other New Routers
| File | Endpoints |
|---|---|
| `mac/routers/academic.py` | CRUD for branches and sections |
| `mac/routers/file_share.py` | Admin upload, user download, stats |
🔲 Remaining / Optional
| Item | Priority | Notes |
|---|---|---|
| Frontend PWA icons | Medium | static/icon-192.png, static/icon-512.png, static/favicon.ico (actual image files still needed) |
| Frontend: refresh token flow | Medium | Silent JWT refresh in api.js before expiry |
| Multi-stage Dockerfile | Low | Stage 1: node build frontend; Stage 2: python + nginx |
| Feature flag wiring | Low | feature_required("ai_chat") on /query/*, etc. |
| HTTPS cert setup | Deployment | Use nginx.https.conf + Let's Encrypt / self-signed |
| `alembic/versions/0003` | When schema changes | Node notebook_port and tags columns |
| Video generation service | Future | VideoProject / VideoJob models exist, router not yet created |
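The pending feature flag wiring might look something like the sketch below. It is a framework-free illustration, not the project's `feature_required`: the table above shows a one-argument call, whereas this version takes a second `get_flags` argument (an assumed way to supply current flag values), and `FeatureDisabledError` is a hypothetical exception. Treating unknown flags as disabled is also an assumption.

```python
from typing import Callable, Mapping


class FeatureDisabledError(Exception):
    """Raised when a route's feature flag is off (hypothetical error type)."""

    def __init__(self, key: str) -> None:
        super().__init__(f"feature '{key}' is disabled")
        self.key = key


def feature_required(
    key: str, get_flags: Callable[[], Mapping[str, bool]]
) -> Callable[[], None]:
    """Build a guard that passes only while the named flag is enabled."""

    def guard() -> None:
        # Look the flag up on every call so admin changes apply immediately.
        if not get_flags().get(key, False):
            raise FeatureDisabledError(key)

    return guard
```

In a FastAPI router, a guard like this could be passed to `Depends(...)` on the `/query/*` routes, with the error translated into an HTTP error response.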
Deployment Quick-Start
Master node
# 1. Build frontend
cd frontend && npm install && npm run build && cd ..
# 2. Configure environment
cp .env.example .env # edit DB, Redis, model settings
# 3. Run DB migrations
docker compose up postgres -d
docker compose run --rm mac alembic upgrade head
# 4. Start all services
docker compose up -d
Adding a worker node
# On the master: generate enrollment token
curl -X POST http://MASTER_IP:8000/api/v1/cluster/enroll-token \
-H "Authorization: Bearer ADMIN_JWT" \
-H "Content-Type: application/json" \
-d '{"label":"Lab PC 1","expires_hours":24}'
# On the worker PC
MAC_MASTER_URL=http://MASTER_IP:8000 \
MAC_ENROLL_TOKEN=<token_from_above> \
MAC_VLLM_PORT=8001 \
docker compose -f docker-compose.worker.yml up -d
# Then approve the node in MAC admin panel → Cluster tab
HTTPS (production)
# Place certs in nginx/ssl/
# Swap nginx config:
# In docker-compose.yml, change:
# volumes: ./nginx/nginx.conf → ./nginx/nginx.https.conf
# Then restart nginx
Last updated: 2026-04-27 (Session 2 complete)