---
tags:
- ml-intern
---
# SmartClass Deployment & Operations
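Deployment and operations infrastructure for the SmartClass face-recognition attendance system: the Docker Compose stack, database migrations, edge-node deployment, monitoring, and CI/CD.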
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                        Docker Compose Stack                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌─────────┐  ┌──────────┐  ┌─────────┐  ┌──────────────────────┐   │
│  │  Redis  │  │PostgreSQL│  │   API   │  │      API Worker      │   │
│  │  :6379  │  │  :5432   │  │  :8000  │  │  (background jobs)   │   │
│  └────┬────┘  └────┬─────┘  └────┬────┘  └──────────┬───────────┘   │
│       │            │             │                  │               │
│  ┌────┴────────────┴─────────────┴──────────────────┘               │
│  │                                                                  │
│  ┌─────────┐  ┌──────────┐  ┌──────────────┐                        │
│  │  Edge   │  │ Frontend │  │  Prometheus  │                        │
│  │  :9100  │  │  :5173   │  │    :9090     │                        │
│  └─────────┘  └──────────┘  └──────────────┘                        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
## Quick Start
### 1. Clone and Configure
```bash
git clone <repo-url> smartclass
cd smartclass
# Create environment file
cp .env.example .env
# Generate a secure JWT secret
python -c "import secrets; print(f'JWT_SECRET_KEY={secrets.token_hex(32)}')" >> .env
```
### 2. Start All Services
```bash
# Start the full stack
docker compose up -d
# Check all services are running
docker compose ps
# View logs
docker compose logs -f
```
### 3. Run Database Migrations
```bash
docker compose exec api alembic upgrade head
```
### 4. Verify Deployment
```bash
# Quick health check
./scripts/healthcheck.sh
# Full smoke test
python scripts/postdeploy_smoke_check.py
```
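The contents of `scripts/postdeploy_smoke_check.py` are not shown in this README. As a rough illustration of what a post-deploy check can look like, the sketch below probes the HTTP endpoints listed under Monitoring. The names `ENDPOINTS`, `http_status`, and `smoke_check` are hypothetical, and probing Prometheus at `/-/healthy` is an assumption rather than something this repository documents.

```python
import urllib.error
import urllib.request

# URLs follow the Access Points table; /-/healthy is Prometheus' built-in
# health endpoint and is an assumption about what is worth probing.
ENDPOINTS = {
    "api": "http://localhost:8000/health",
    "edge": "http://localhost:9100/metrics",
    "prometheus": "http://localhost:9090/-/healthy",
}


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except OSError:
        # URLError, timeouts and refused connections all derive from OSError.
        return None


def smoke_check(endpoints=ENDPOINTS, fetch=http_status):
    """Map each service name to True if its endpoint answered with HTTP 200."""
    return {name: fetch(url) == 200 for name, url in endpoints.items()}
```

Calling `smoke_check()` against a running stack returns a dictionary such as `{"api": True, ...}`. The `fetch` parameter exists so the check can be exercised without a live deployment.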
## Project Structure
```
smartclass-ops/
├── docker-compose.yml              # Full service stack
├── Dockerfile                      # Edge node container
├── .env.example                    # Environment template
├── .github/
│   └── workflows/
│       └── ci-cd.yml               # CI/CD pipeline
├── services/
│   └── api/
│       ├── Dockerfile              # API server container
│       ├── alembic.ini             # Migration config
│       ├── alembic/
│       │   ├── env.py              # Async SQLAlchemy support
│       │   ├── script.py.mako      # Migration template
│       │   └── versions/
│       │       └── 001_initial.py  # Initial schema
│       └── init_db.sql             # DB initialization
├── frontend/
│   ├── Dockerfile                  # Multi-stage React build
│   └── nginx.conf                  # SPA routing config
├── monitoring/
│   ├── prometheus.yml              # Scrape configuration
│   ├── smartclass_alerts.yml       # Alert rules (6 alerts)
│   └── grafana_dashboard.json      # Pre-built dashboard
├── scripts/
│   ├── healthcheck.sh              # System health check
│   ├── postdeploy_smoke_check.py   # Post-deploy validation
│   ├── create_deploy_bundle.py     # Bundle creation for edge
│   ├── apply_deploy_bundle.ps1     # Bundle deployment (PowerShell)
│   └── register_edge_node.py       # Edge node registration
├── src/
│   └── edge_metrics.py             # Prometheus metrics exporter
└── config/
    └── edge_config.yaml            # Edge node configuration
```
## Service Management
### Start/Stop
```bash
# Start all
docker compose up -d
# Stop all (preserve data)
docker compose down
# Stop all (remove data)
docker compose down -v
# Restart single service
docker compose restart api
```
### Rebuild After Code Changes
```bash
# Rebuild API
docker compose up -d --build api
# Rebuild edge
docker compose up -d --build edge
# Rebuild all
docker compose up -d --build
```
### View Logs
```bash
docker compose logs -f api # API server
docker compose logs -f edge # Edge pipeline
docker compose logs -f api_worker # Background worker
docker compose logs --tail 100 api # Last 100 lines
```
## Database Migrations
```bash
# Apply all pending migrations
docker compose exec api alembic upgrade head
# Create new migration after model changes
docker compose exec api alembic revision --autogenerate -m "Add new table"
# Rollback last migration
docker compose exec api alembic downgrade -1
# View migration history
docker compose exec api alembic history
```
## Edge Node Deployment
### Register a New Edge Node
```bash
python scripts/register_edge_node.py \
  --node-id rpi5-room301 \
  --section AIML-3-A \
  --api-url http://central-server:8000
```
### Create Deployment Bundle
```bash
python scripts/create_deploy_bundle.py --section AIML-3-A
python scripts/create_deploy_bundle.py --section AIML-3-A --include-models
```
### Deploy to Edge Node (PowerShell on RPi5)
```powershell
powershell -ExecutionPolicy Bypass -File scripts/apply_deploy_bundle.ps1 `
  -BundleZip data/deploy/latest_bundle.zip `
  -SectionKey AIML-3-A `
  -DataDir data
```
### Deploy via Docker
```bash
docker run -d \
  --name smartclass-edge \
  -v /opt/smartclass/config:/opt/smartclass/config:ro \
  -v /opt/smartclass/models:/opt/smartclass/models:ro \
  -v /opt/smartclass/data:/opt/smartclass/data \
  -p 9100:9100 \
  -e REDIS_URL=redis://central-server:6379 \
  -e EDGE_CONFIG_PATH=/opt/smartclass/config/edge_config.yaml \
  smartclass-edge:latest
```
## Monitoring
### Access Points
| Service | URL | Purpose |
|---------|-----|---------|
| Prometheus | http://localhost:9090 | Metrics & alerts |
| Grafana | http://localhost:3000 | Dashboards (if added) |
| Edge Metrics | http://localhost:9100/metrics | Raw edge metrics |
| API Health | http://localhost:8000/health | API status |
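
The edge metrics endpoint serves plain text in the Prometheus exposition format. The sketch below shows one way to read such output in a script, for example while debugging an edge node without opening Prometheus. It handles only the simple `name{labels} value` line form, and the metric name `smartclass_fps` used in the usage note is hypothetical: the names actually exported by `src/edge_metrics.py` are not listed in this README.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into {metric_with_labels: value}.

    Handles the simple `name{labels} value` form only; comment lines
    (# HELP / # TYPE) and optional trailing timestamps are ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split after the closing brace when labels are present, so spaces
        # inside label values do not break the parse.
        if "}" in line:
            name, _, rest = line.rpartition("}")
            name += "}"
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if fields:
            samples[name] = float(fields[0])
    return samples
```

Given a line such as `smartclass_fps{section="AIML-3-A"} 12.5`, the function stores the value `12.5` under the key `smartclass_fps{section="AIML-3-A"}`.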
### Alert Rules
| Alert | Trigger | Severity |
|-------|---------|----------|
| SmartClassLowFPS | FPS < 5 for 60s | Warning |
| SmartClassHighCPUTemp | CPU > 80°C for 60s | Critical |
| SmartClassHighMemoryUsage | Memory > 85% for 60s | Warning |
| SmartClassOfflineQueueBacklog | Queue > 1000 for 2m | Warning |
| SmartClassEdgeUnreachable | Node down for 2m | Critical |
| SmartClassHighRecognitionLatency | Latency > 50ms for 60s | Warning |
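
Each trigger in the table pairs a threshold with a hold time: the condition must stay true for the whole duration before the alert fires, and any recovery resets the clock. The sketch below approximates that behavior over a list of timestamped samples. It is a simplification for illustration, not Prometheus' rule evaluator, and the function name `alert_fires` is hypothetical.

```python
def alert_fires(samples, breaches, hold_seconds):
    """Return True if `breaches(value)` held continuously for hold_seconds.

    samples: list of (timestamp_seconds, value) pairs in ascending time order.
    """
    breach_start = None
    for ts, value in samples:
        if breaches(value):
            if breach_start is None:
                breach_start = ts  # first sample of this breach streak
            if ts - breach_start >= hold_seconds:
                return True
        else:
            breach_start = None  # condition cleared, restart the clock
    return False
```

For the `SmartClassLowFPS` row, a call would look like `alert_fires(fps_samples, lambda fps: fps < 5, 60)`. A brief dip below 5 FPS that recovers within the minute does not fire.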
### Add New Edge Node to Prometheus
Edit `monitoring/prometheus.yml` and add an entry to the target list of the `smartclass-edge` scrape job:
```yaml
- targets: ["192.168.1.103:9100"]
  labels:
    service: "edge"
    section: "CSE-2-B"
    location: "room-205"
    environment: "production"
```
Then reload Prometheus (the `/-/reload` endpoint is available only when Prometheus runs with `--web.enable-lifecycle`):
```bash
curl -X POST http://localhost:9090/-/reload
```
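When several edge nodes are added at once, the target entry can be generated rather than typed by hand. The sketch below renders one entry in the shape shown above. The function name `render_edge_target` and its defaults are hypothetical, and the label set simply mirrors the example entry.

```python
def render_edge_target(host, section, location, port=9100, environment="production"):
    """Render one scrape-target entry as YAML text, indented as a list item."""
    lines = [
        f'- targets: ["{host}:{port}"]',
        "  labels:",
        '    service: "edge"',
        f'    section: "{section}"',
        f'    location: "{location}"',
        f'    environment: "{environment}"',
    ]
    return "\n".join(lines)
```

`print(render_edge_target("192.168.1.103", "CSE-2-B", "room-205"))` reproduces the example entry; indent the output to match its position in `prometheus.yml` before pasting.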
## CI/CD Pipeline
The GitHub Actions pipeline (`.github/workflows/ci-cd.yml`) runs:
1. **Python CI** — ruff lint, mypy type check, pytest with coverage
2. **Frontend CI** — ESLint, TypeScript check, build, tests
3. **Docker Build** — Build and push images to GHCR (on push to main)
4. **Deploy** — SSH deploy to production (on version tags)
### Trigger a Release
```bash
git tag v1.2.0
git push origin v1.2.0
```
Images will be published to:
- `ghcr.io/<owner>/smart-attendance-edge:v1.2.0`
- `ghcr.io/<owner>/smart-attendance-api:v1.2.0`
- `ghcr.io/<owner>/smart-attendance-frontend:v1.2.0`
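
The image references follow directly from the owner and the pushed tag. The sketch below derives them and rejects tags that do not look like a release. The `vMAJOR.MINOR.PATCH` pattern is an assumption, since the tag filter used in `ci-cd.yml` is not shown here, and the function name `release_images` is hypothetical.

```python
import re

IMAGE_NAMES = (
    "smart-attendance-edge",
    "smart-attendance-api",
    "smart-attendance-frontend",
)
# Assumed tag shape; the workflow's actual tag filter may differ.
VERSION_TAG = re.compile(r"v\d+\.\d+\.\d+")


def release_images(owner, tag):
    """Return the GHCR image references expected for a version tag."""
    if not VERSION_TAG.fullmatch(tag):
        raise ValueError(f"not a version tag: {tag!r}")
    # Container image repository names must be lowercase.
    return [f"ghcr.io/{owner.lower()}/{name}:{tag}" for name in IMAGE_NAMES]
```

A list like this is handy after a release for checking that every image can be pulled, for example by passing each entry to `docker pull`.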
## Troubleshooting
### API Container Won't Start
```bash
# Check logs
docker compose logs api
# Common fixes:
# 1. Database not ready - check postgres health
docker compose exec postgres pg_isready
# 2. Missing JWT_SECRET_KEY
grep JWT_SECRET_KEY .env
# 3. Run migrations
docker compose exec api alembic upgrade head
```
### Edge Node Can't Connect to Redis
```bash
# Check Redis is running
docker compose exec redis redis-cli ping
# Check from edge container
docker compose exec edge python -c "import redis; r=redis.from_url('redis://redis:6379'); print(r.ping())"
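# Check network (the name prefix is the Compose project name, which defaults to
# the directory name; run `docker network ls` if this name is not found)
docker network inspect smartclass-ops_smartclass-net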
```
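When the edge container cannot reach Redis, it helps to separate a network problem from a Redis problem. The sketch below uses only the standard library, so it runs even where the `redis` package is not installed: it extracts the host and port from `REDIS_URL` and tests whether a TCP connection opens. Both function names are hypothetical.

```python
import socket
from urllib.parse import urlparse


def redis_endpoint(url):
    """Extract (host, port) from a redis:// URL, defaulting to port 6379."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss") or not parsed.hostname:
        raise ValueError(f"not a Redis URL: {url!r}")
    return parsed.hostname, parsed.port or 6379


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

If `tcp_reachable(*redis_endpoint("redis://central-server:6379"))` returns `False`, look at DNS, firewalls, or the Docker network first. If it returns `True` but `redis-cli ping` still fails, the problem is more likely in Redis itself, such as authentication.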
### Frontend Shows Blank Page
```bash
# Check build args
docker compose logs frontend
# Rebuild with correct API URL
VITE_API_URL=http://your-api-server:8000 docker compose up -d --build frontend
```
### High Memory Usage / OOM
```bash
# Check container memory usage
docker stats --no-stream
# Increase limits in docker-compose.yml:
#   deploy:
#     resources:
#       limits:
#         memory: 2G
```
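To spot the offending container in a script rather than by eye, the output of `docker stats --no-stream --format "{{json .}}"` can be filtered. The sketch below assumes one JSON object per line with `Name` and `MemPerc` fields, as emitted by recent Docker CLI versions; verify the field names against your installation. The default threshold mirrors the 85% used by the `SmartClassHighMemoryUsage` alert, and the function name is hypothetical.

```python
import json


def containers_over_limit(stats_output, threshold_percent=85.0):
    """Return names of containers whose MemPerc exceeds threshold_percent."""
    flagged = []
    for line in stats_output.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        try:
            mem_percent = float(row["MemPerc"].rstrip("%"))
        except ValueError:
            continue  # non-numeric reading such as "--"
        if mem_percent > threshold_percent:
            flagged.append(row["Name"])
    return flagged
```

The function takes the command output as a string, so it can be fed from `subprocess.run(..., capture_output=True, text=True).stdout` or from a saved log.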
## Environment Variables Reference
| Variable | Service | Required | Default |
|----------|---------|----------|---------|
| `POSTGRES_USER` | postgres | No | sc_user |
| `POSTGRES_PASSWORD` | postgres | Yes | - |
| `POSTGRES_DB` | postgres | No | smartclass |
| `JWT_SECRET_KEY` | api | Yes | - |
| `REDIS_URL` | api, worker, edge | No | redis://redis:6379 |
| `CORS_ORIGINS` | api | No | ["http://localhost:5173"] |
| `ENVIRONMENT` | api | No | development |
| `LOG_LEVEL` | api, worker | No | info |
| `EDGE_MODE` | edge | No | test |
| `VITE_API_URL` | frontend | No | http://localhost:8000 |
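
Only `POSTGRES_PASSWORD` and `JWT_SECRET_KEY` lack defaults, so a missing value for either is a likely cause of a failed start. The sketch below checks the text of a `.env` file for them before the stack is brought up. It handles plain `KEY=VALUE` lines only, not the full syntax Docker Compose accepts, and the function names are hypothetical.

```python
REQUIRED = ("POSTGRES_PASSWORD", "JWT_SECRET_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="value" and KEY=value compare equal.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_required(text, required=REQUIRED):
    """Return the required variable names that are absent or empty."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]
```

A typical call is `missing_required(open(".env").read())`; an empty list means both required variables are set.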
## Security Notes
- **Never** commit `.env` files to version control
- JWT secret must be at least **256 bits** (32 random bytes) for production; `secrets.token_hex(32)` from the Quick Start produces exactly that
- Use a secrets vault (HashiCorp Vault, AWS Secrets Manager) in production
- The `.gitignore` should include: `.env`, `data/`, `models/`, `*.pem`
- API runs as non-root user inside container
- Edge node runs as non-root user inside container
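
The 256-bit requirement can be checked mechanically when the secret is hex-encoded, as it is when generated with `secrets.token_hex`: each hex character carries 4 bits, so 64 characters are needed. The sketch below assumes a hex-encoded secret and treats anything else as failing the check. The function names are hypothetical.

```python
import secrets


def hex_secret_bits(secret):
    """Number of bits encoded by a hex string (4 bits per hex character)."""
    return len(bytes.fromhex(secret)) * 8


def meets_minimum(secret, minimum_bits=256):
    """Return True if a hex-encoded secret carries at least minimum_bits."""
    try:
        return hex_secret_bits(secret) >= minimum_bits
    except ValueError:
        return False  # not valid hex, so this check cannot vouch for it
```

`meets_minimum(secrets.token_hex(32))` is `True`, while a 16-byte secret from `secrets.token_hex(16)` is rejected.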
<!-- ml-intern-provenance -->
## Generated by ML Intern
This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "balaji958685/smartclass-ops"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.