
MAC: MBM AI Cloud · Build Progress

Project: Self-hosted AI inference platform for MBM University Jodhpur
Stack: FastAPI · SvelteKit · PostgreSQL · Redis · vLLM · Nginx · Docker
Repo: D:\mac2 (push to github.com/mbmuniversity2026/MAC)


✅ Completed

Session 1: Backend Foundation

Database / Migrations

  • Alembic wired: alembic/env.py imports all models
  • 20260426_0001_initial_schema.py: full initial schema capture
  • 20260427_0002_session1_tables.py: feature_flags, system_config, branches, sections, cluster_heartbeats, shared_files, file_downloads, video_projects, video_jobs + user columns

New Models (mac/models/)

| File | Tables |
|------|--------|
| feature_flag.py | FeatureFlag |
| academic.py | Branch, Section |
| cluster.py | ClusterNode, ClusterHeartbeat |
| file_share.py | SharedFile, FileDownload |
| video.py | VideoProject, VideoJob |
| system_config.py | SystemConfig |

New Routers (mac/routers/)

| File | Endpoints |
|------|-----------|
| features.py | GET /features/status, PATCH /admin/features/{key} |
| hardware.py | GET /hardware/local, GET /hardware/recommendations |
| network.py | GET /network/local-ip, GET /network/discover |
| system.py | GET /system/version, GET /system/update-status, POST /admin/system/restart |
| setup.py | GET /setup/status, POST /setup/create-admin, GET /setup/recovery |

New Services (mac/services/)

  • feature_seeder.py: seeds default flags on startup
  • setup_service.py: JWT secret management via system_config
  • token_blacklist_service.py: JWT blacklist via Redis (TTL-matched)

Security Improvements

  • JWT jti claim added to all access tokens (mac/utils/security.py)
  • auth_middleware.py checks blacklist on every request
  • Logout blacklists current access token + revokes refresh tokens
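
The "TTL-matched" blacklist can be pictured as follows. This is only a minimal sketch, not the code in token_blacklist_service.py: it assumes each access token carries `jti` and `exp` claims, and it uses a hypothetical in-memory class (`InMemoryBlacklist`) in place of Redis so that it runs on its own. The 900-second default lifetime is also an assumption.

```python
import time
import uuid


class InMemoryBlacklist:
    """Hypothetical stand-in for Redis: maps a token's jti to its expiry time."""

    def __init__(self):
        self._expiry_by_jti = {}

    def revoke(self, claims, now=None):
        """Blacklist a token only until its own `exp`, so the entry never outlives it."""
        now = time.time() if now is None else now
        ttl = int(claims["exp"] - now)
        if ttl <= 0:
            return 0  # token already expired; nothing needs to be stored
        self._expiry_by_jti[claims["jti"]] = now + ttl
        return ttl

    def is_revoked(self, jti, now=None):
        """Check that an auth middleware would run on every request."""
        now = time.time() if now is None else now
        expires_at = self._expiry_by_jti.get(jti)
        if expires_at is None:
            return False
        if expires_at <= now:
            del self._expiry_by_jti[jti]  # mimic automatic key expiry
            return False
        return True


def new_access_claims(user_id, lifetime_seconds=900, now=None):
    """Build access-token claims with a unique jti (lifetime is an assumed default)."""
    now = int(time.time() if now is None else now)
    return {"sub": user_id, "jti": uuid.uuid4().hex, "iat": now, "exp": now + lifetime_seconds}


if __name__ == "__main__":
    blacklist = InMemoryBlacklist()
    claims = new_access_claims("student-1")
    print("revoked before logout:", blacklist.is_revoked(claims["jti"]))
    print("ttl stored on logout:", blacklist.revoke(claims))
    print("revoked after logout:", blacklist.is_revoked(claims["jti"]))
```

With Redis, the same idea would typically be a `SET` with an `EX` expiry equal to the remaining token lifetime, so blacklist entries clean themselves up once the token could no longer be used anyway.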

Session 2: SvelteKit Frontend

Full PWA frontend at frontend/:

| File | Description |
|------|-------------|
| src/app.html | PWA shell, Google Fonts, SW registration |
| src/app.css | Full design system: dark theme, mac-blue palette, all component classes |
| src/lib/api.js | Complete API client (auth, query, models, usage, quota, keys, features, hardware, network, system, users, guardrails, rag, notifications, cluster, academic, files) |
| src/lib/stores.js | authStore, chatStore, setupStore, featureStore, toast, sidebarOpen |
| src/lib/i18n.js | 19 Indian languages with lazy loading + RTL support |
| src/lib/components/ParticleCanvas.svelte | Physics particle animation (login/splash) |
| src/lib/components/Sidebar.svelte | Navigation sidebar with all routes |
| src/lib/components/Toast.svelte | Toast notification component |
| src/lib/components/ChatMessage.svelte | Chat bubble with markdown rendering |
| src/routes/+layout.svelte | Auth guard, setup check, shell layout |
| src/routes/+page.svelte | Landing / redirect |
| src/routes/login/+page.svelte | Animated login page with particle canvas |
| src/routes/setup/+page.svelte | First-run admin setup wizard |
| src/routes/chat/+page.svelte | SSE streaming chat with model picker |
| src/routes/dashboard/+page.svelte | Activity heatmap, quota rings, model distribution |
| src/routes/admin/+page.svelte | Admin panel: Users, Models, Features, Hardware, System tabs |
| src/routes/cluster/+page.svelte | Cluster management: node list, detail, actions, history chart, enrollment tokens |
| src/routes/keys/+page.svelte | API key management (generate, copy, revoke) |
| src/routes/settings/+page.svelte | Profile, change password, language picker |
| src/routes/notifications/+page.svelte | Notification list with mark-read |
| src/routes/rag/+page.svelte | RAG document upload (drag-and-drop) + list |

Static assets:

  • static/manifest.json: PWA manifest with shortcuts
  • static/sw.js: Service worker (cache-first shell, network-first API, SSE passthrough)

Infrastructure:

  • nginx/nginx.conf: HTTP server (production)
  • nginx/nginx.https.conf: HTTPS server with TLS, HSTS, WebSocket proxy
  • docker-compose.yml: Master node: MAC API + vLLM + Postgres + Redis + Nginx + Qdrant + SearXNG
  • docker-compose.worker.yml: Worker node: vLLM + optional Jupyter + worker-agent

Session 2: Distributed Cluster Backend

Cluster Architecture

Master node (this machine)
  ├── MAC API (FastAPI)          - receives all user requests
  ├── PostgreSQL                 - DB (master-only)
  ├── Redis                      - cache, rate limiting, JWT blacklist
  ├── Nginx                      - reverse proxy + frontend
  ├── Qdrant                     - vector DB for RAG
  └── SearXNG                    - web search

Worker nodes (any PC on same network)
  ├── vLLM                       - GPU inference (OpenAI-compatible)
  ├── Jupyter kernel gateway     - notebook execution (optional)
  └── worker_agent.py            - heartbeat + registration agent

Cluster Services

  • mac/services/load_balancer.py: score-based routing: gpu_util×0.5 + vram_ratio×0.3, 30s stale threshold
  • mac/services/llm_service.py: updated _resolve_model_cluster to use load balancer before local vLLM
  • mac/models/node.py: WorkerNode + NodeModelDeployment + EnrollmentToken; added notebook_port, tags
  • mac/models/cluster.py: ClusterHeartbeat time-series
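
To make the scoring rule concrete, here is a minimal sketch of how such a selection could work. It is not the code in load_balancer.py. The assumptions are that `gpu_util` is a fraction between 0 and 1, that `vram_ratio` means used VRAM divided by total VRAM, and that a lower score means a less loaded node and is therefore preferred. `NodeStatus`, `load_score`, and `pick_node` are hypothetical names.

```python
import time
from dataclasses import dataclass

STALE_AFTER_SECONDS = 30  # the 30s stale threshold mentioned above


@dataclass
class NodeStatus:
    """Hypothetical snapshot of a worker, built from its latest heartbeat."""
    name: str
    gpu_util: float        # assumed to be a 0.0-1.0 fraction
    vram_used_mb: float
    vram_total_mb: float
    last_heartbeat: float  # Unix timestamp of the latest heartbeat


def load_score(node):
    """gpu_util x 0.5 + vram_ratio x 0.3; lower is assumed to mean less loaded."""
    if node.vram_total_mb:
        vram_ratio = node.vram_used_mb / node.vram_total_mb
    else:
        vram_ratio = 1.0  # unknown VRAM is treated as full
    return node.gpu_util * 0.5 + vram_ratio * 0.3


def pick_node(nodes, now=None):
    """Return the least-loaded node with a fresh heartbeat, or None if all are stale."""
    now = time.time() if now is None else now
    fresh = [n for n in nodes if now - n.last_heartbeat <= STALE_AFTER_SECONDS]
    if not fresh:
        return None  # the caller would fall back to the local vLLM instance
    return min(fresh, key=load_score)


if __name__ == "__main__":
    now = time.time()
    nodes = [
        NodeStatus("lab-pc-1", 0.9, 8000, 16000, now - 5),
        NodeStatus("lab-pc-2", 0.2, 4000, 16000, now - 8),
        NodeStatus("lab-pc-3", 0.0, 0, 16000, now - 120),  # stale, so ignored
    ]
    chosen = pick_node(nodes, now=now)
    print(chosen.name if chosen else "local vLLM")
```

Returning `None` when every heartbeat is stale mirrors the order described above: the load balancer is consulted first, and local vLLM remains the fallback.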

Cluster Router (mac/routers/cluster.py)

| Endpoint | Description |
|----------|-------------|
| POST /cluster/enroll-token | Admin generates one-time enrollment token |
| GET /cluster/enroll-tokens | List all tokens |
| POST /cluster/register | Worker self-registers (no JWT; uses enrollment token) |
| POST /cluster/heartbeat | Worker sends heartbeat every 10s |
| GET /cluster/nodes | List all nodes with live health |
| GET /cluster/nodes/{id} | Node detail with deployments |
| POST /cluster/nodes/{id}/action | approve / drain / reactivate / remove |
| POST /cluster/nodes/{id}/deploy | Register vLLM deployment on node |
| DELETE /cluster/nodes/{id}/deploy/{dep_id} | Remove deployment |
| GET /cluster/nodes/{id}/history | Heartbeat time-series (for charts) |
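
The one-time enrollment token behind `enroll-token` and `register` might behave roughly like the sketch below. This illustrates the idea only and is not the router's implementation: `EnrollmentTokens` is a hypothetical in-memory store, and the `label` and `expires_hours` fields are taken from the request body shown in the quick-start section.

```python
import secrets
import time


class EnrollmentTokens:
    """Hypothetical in-memory store for one-time, expiring enrollment tokens."""

    def __init__(self):
        self._tokens = {}

    def issue(self, label, expires_hours, now=None):
        """Admin side: create a random token that is valid for `expires_hours`."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {
            "label": label,
            "expires_at": now + expires_hours * 3600,
            "used": False,
        }
        return token

    def redeem(self, token, now=None):
        """Worker side: succeeds once per token, and only before it expires."""
        now = time.time() if now is None else now
        record = self._tokens.get(token)
        if record is None or record["used"] or record["expires_at"] <= now:
            return False
        record["used"] = True  # one-time: any later attempt is rejected
        return True


if __name__ == "__main__":
    store = EnrollmentTokens()
    token = store.issue("Lab PC 1", expires_hours=24)
    print("first registration accepted:", store.redeem(token))
    print("second registration accepted:", store.redeem(token))
```

Because the worker has no JWT at registration time, the token is the only credential it presents, which is why single use and a short expiry matter here.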

Worker Agent (worker_agent.py)

Standalone Python script for worker PCs:

  • Reads MAC_MASTER_URL, MAC_ENROLL_TOKEN, MAC_VLLM_PORT, etc. from env
  • Self-registers on startup via enrollment token
  • Sends heartbeats every 10s with GPU/CPU/RAM metrics (via pynvml + psutil)
  • Queries local vLLM /v1/models to report active models
  • Handles stale/auth errors gracefully
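
The agent's main loop can be sketched as below. This is an illustration under stated assumptions, not the contents of worker_agent.py: the payload field names, the default port of 8001, and all function names are hypothetical. Metric collection (pynvml + psutil) and the HTTP calls are passed in as callables so that the sketch runs without a GPU or a master node.

```python
import json
import os
import time

HEARTBEAT_INTERVAL_SECONDS = 10  # heartbeat period described above


def read_config(env=None):
    """Read agent settings from the environment (8001 is an assumed default port)."""
    env = os.environ if env is None else env
    return {
        "master_url": env["MAC_MASTER_URL"].rstrip("/"),
        "enroll_token": env["MAC_ENROLL_TOKEN"],
        "vllm_port": int(env.get("MAC_VLLM_PORT", "8001")),
    }


def parse_vllm_models(response_body):
    """Extract model IDs from an OpenAI-style /v1/models JSON response."""
    payload = json.loads(response_body)
    return [item["id"] for item in payload.get("data", [])]


def build_heartbeat(node_id, metrics, models, now=None):
    """Assemble one heartbeat payload; the field names here are hypothetical."""
    return {
        "node_id": node_id,
        "timestamp": time.time() if now is None else now,
        "gpu_util": metrics["gpu_util"],
        "cpu_percent": metrics["cpu_percent"],
        "ram_percent": metrics["ram_percent"],
        "models": sorted(models),
    }


def run_heartbeats(send, collect_metrics, list_models, node_id,
                   interval=HEARTBEAT_INTERVAL_SECONDS, beats=None, sleep=time.sleep):
    """Send a heartbeat every `interval` seconds; `beats=None` means run forever."""
    sent = 0
    while beats is None or sent < beats:
        send(build_heartbeat(node_id, collect_metrics(), list_models()))
        sent += 1
        sleep(interval)


if __name__ == "__main__":
    # Fake collectors and a print-based sender stand in for pynvml/psutil and HTTP.
    run_heartbeats(
        send=lambda payload: print(json.dumps(payload)),
        collect_metrics=lambda: {"gpu_util": 0.4, "cpu_percent": 12.0, "ram_percent": 55.0},
        list_models=lambda: ["example-model"],
        node_id="node-1",
        interval=0,
        beats=2,
    )
```

Injecting `send`, `collect_metrics`, and `list_models` keeps the loop testable; in a real agent they would wrap the POST to `/cluster/heartbeat`, the hardware probes, and the GET to the local vLLM `/v1/models`.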

Other New Routers

| File | Endpoints |
|------|-----------|
| mac/routers/academic.py | CRUD for branches and sections |
| mac/routers/file_share.py | Admin upload, user download, stats |

🔲 Remaining / Optional

| Item | Priority | Notes |
|------|----------|-------|
| Frontend PWA icons | Medium | static/icon-192.png, static/icon-512.png, static/favicon.ico: need actual PNG files |
| Frontend: refresh token flow | Medium | Silent JWT refresh in api.js before expiry |
| Multi-stage Dockerfile | Low | Stage 1: node build frontend; Stage 2: python + nginx |
| Feature flag wiring | Low | feature_required("ai_chat") on /query/*, etc. |
| HTTPS cert setup | Deployment | Use nginx.https.conf + Let's Encrypt / self-signed |
| alembic/versions/0003 | When schema changes | Node notebook_port and tags columns |
| Video generation service | Future | VideoProject / VideoJob models exist, router not yet created |
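
For the pending feature flag wiring, the guard could take a shape like the one below. This is a framework-agnostic sketch, not existing project code: only the name `feature_required` and the `ai_chat` key come from the table above, while the `FLAGS` dictionary, the `video_generation` key, and the `FeatureDisabled` exception are hypothetical. In the FastAPI app it would more likely be a dependency that reads flags from the database and returns an HTTP error.

```python
import functools


class FeatureDisabled(Exception):
    """Raised when a handler is called while its feature flag is off."""


# Hypothetical in-memory flag state; the real flags live in the feature_flags table.
FLAGS = {"ai_chat": True, "video_generation": False}


def feature_required(key, flags=FLAGS):
    """Decorator factory: block the wrapped handler unless flag `key` is enabled."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            # The flag is checked at call time, so toggles apply without a restart.
            if not flags.get(key, False):
                raise FeatureDisabled(key)
            return handler(*args, **kwargs)
        return wrapper
    return decorator


@feature_required("ai_chat")
def query(prompt):
    """Placeholder handler standing in for a /query/* endpoint."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(query("hello"))
    FLAGS["ai_chat"] = False
    try:
        query("hello again")
    except FeatureDisabled as exc:
        print("blocked, feature disabled:", exc)
```

Unknown keys default to disabled in this sketch, which is the safer failure mode if a flag has not been seeded yet.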

Deployment Quick-Start

Master node

# 1. Build frontend
cd frontend && npm install && npm run build && cd ..

# 2. Configure environment
cp .env.example .env   # edit DB, Redis, model settings

# 3. Run DB migrations
docker compose up postgres -d
docker compose run --rm mac alembic upgrade head

# 4. Start all services
docker compose up -d

Adding a worker node

# On the master: generate an enrollment token (JSON body, so send a JSON content type)
curl -X POST http://MASTER_IP:8000/api/v1/cluster/enroll-token \
  -H "Authorization: Bearer ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{"label":"Lab PC 1","expires_hours":24}'

# On the worker PC
MAC_MASTER_URL=http://MASTER_IP:8000 \
MAC_ENROLL_TOKEN=<token_from_above> \
MAC_VLLM_PORT=8001 \
docker compose -f docker-compose.worker.yml up -d

# Then approve the node in MAC admin panel → Cluster tab

HTTPS (production)

# Place certs in nginx/ssl/
# Swap nginx config:
# In docker-compose.yml, change:
#   volumes: ./nginx/nginx.conf → ./nginx/nginx.https.conf
# Then restart nginx

Last updated: 2026-04-27 · Session 2 complete