Spaces:
Running
Deploy Guide β Render.com
End-to-end walkthrough for deploying HasarΔ° to Render.com using render.yaml. Covers prerequisites, infra setup, environment configuration, the first deploy, smoke tests, monitoring, rollback, and cost.
Target audience: anyone with shell access and a Render account, no prior Render experience required.
Prerequisites
Before you start, you need:
| Item | Why | How to get it |
|---|---|---|
| Render account | Hosts the API + worker | render.com β free tier ok for the web service; Postgres/Redis are paid |
| GitHub access to the repo | Render builds from git | arac-hasar-v2 repo permissions |
| AWS S3 bucket (or compatible) | Image storage (uploads + visualizations) | Or use Cloudflare R2 / Backblaze B2 β anything S3-compatible |
AWS IAM access key with s3:GetObject, s3:PutObject, s3:DeleteObject on the bucket |
Backend uploads/signs URLs | AWS console β IAM |
| Custom domain (optional) | Branded URL | Any registrar, point CNAME at Render |
A strong JWT_SECRET_KEY |
Signs auth tokens | openssl rand -base64 48 |
| Sentry DSN (optional) | Error tracking | sentry.io |
Time estimate: 45β90 minutes for the first deploy. Subsequent deploys are git-push fast.
Step 1 β Provision infrastructure
The render.yaml at the repo root declares all services. Render reads it on first connect and creates everything in one go.
1a. Create the Postgres database
In the Render dashboard:
- New + β PostgreSQL
- Name:
hasari-db - Database:
hasari - User:
hasari - Region: pick the one nearest your users (Frankfurt for EU/TR, Oregon for US)
- Plan: Starter ($7/month) is sufficient for the pilot. Free tier expires after 90 days β do not use it for production.
- Create database.
Copy the Internal Database URL (format: postgres://hasari:β¦@β¦/hasari). You'll paste it as DATABASE_URL in step 2.
1b. Create the Redis instance
- New + β Redis
- Name:
hasari-redis - Region: same as Postgres
- Plan: Starter ($10/month). Free tier has no persistence; do not use for production.
- Maxmemory policy:
allkeys-lru - Create Redis.
Copy the Internal Redis URL β you'll use it as REDIS_URL.
1c. Create the S3 bucket
In the AWS console:
- S3 β Create bucket β
hasari-uploads-prod(or your name), region matching the API for low latency - Block all public access β yes, keep all four boxes checked. The backend serves presigned URLs; no public listing.
- Versioning: disabled (uploads are immutable; no need for revisions)
- Server-side encryption: SSE-S3 (default) is fine
- After creation: Permissions β CORS β add:
[
{
"AllowedHeaders": ["*"],
"AllowedMethods": ["GET", "PUT", "POST"],
"AllowedOrigins": ["https://hasari.app", "https://hasari-api.onrender.com"],
"ExposeHeaders": ["ETag", "x-amz-request-id"],
"MaxAgeSeconds": 3000
}
]
Replace the origins with your actual web app URL.
- IAM β create an IAM user
hasari-backend, attach an inline policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::hasari-uploads-prod/*"
},
{
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::hasari-uploads-prod"
}
]
}
Create an access key for this user β save both halves now, you cannot retrieve the secret again.
Step 2 β Configure the backend service
2a. Connect the repo
- Render dashboard β New + β Blueprint
- Connect your GitHub account, pick
arac-hasar-v2 - Render detects
render.yamland shows the services it will create:hasari-apiβ web service (FastAPI)hasari-workerβ background worker (Celery)
- Click Apply to create both.
2b. Set environment variables
For each of hasari-api and hasari-worker, go to Environment and add:
Required
| Name | Example | Description | Security note |
|---|---|---|---|
ENVIRONMENT |
production |
Disables dev auth fallback | Never set to dev here |
DATABASE_URL |
postgres://β¦ |
From step 1a | Internal URL only; no external traffic |
REDIS_URL |
redis://β¦ |
From step 1b | Internal URL only |
JWT_SECRET_KEY |
(32+ char random string) | Signs JWTs | Generate fresh: openssl rand -base64 48. Rotating invalidates all existing sessions. |
S3_BUCKET |
hasari-uploads-prod |
Bucket name | β |
S3_REGION |
eu-central-1 |
AWS region | β |
S3_ACCESS_KEY |
AKIA⦠|
From step 1c | Use a dedicated IAM user, not your root key |
S3_SECRET_KEY |
β¦ |
From step 1c | Mark this var as "secret" in Render UI |
S3_ENDPOINT_URL |
(blank for AWS) | Set only for R2/MinIO/B2 | β |
CORS_ORIGINS |
https://hasari.app,https://www.hasari.app |
Comma-separated allowed web origins | Never use * in production |
Recommended
| Name | Default | Description |
|---|---|---|
ACCESS_TOKEN_MINUTES |
30 |
Short access TTL β keeps damage from a stolen token bounded |
REFRESH_TOKEN_DAYS |
7 |
Refresh TTL β balance UX vs. risk |
MAX_IMAGE_SIZE_MB |
12 |
Per-image upload limit |
MAX_IMAGES_SYNC |
5 |
Sync mode cap |
MAX_IMAGES_ASYNC |
20 |
Async mode cap |
SENTRY_DSN |
(blank) | Enable Sentry error tracking |
LOG_LEVEL |
INFO |
DEBUG for troubleshooting; keep INFO in prod |
ML service
| Name | Default | Description |
|---|---|---|
ML_MODEL_DIR |
/app/models |
Path to YOLO .pt weight files inside the container |
ML_DEVICE |
cpu |
cuda requires a GPU instance (Render does not offer GPU β keep CPU on Render and offload heavy ML to a separate GPU host or external service for production) |
GPU note: Render does not currently offer GPU instances. For the pilot, the backend runs YOLO on CPU β slower (~5β10Γ CPU vs. GPU). For production loads above ~50 inspections/hour, host the ML service separately on a GPU VPS (Hetzner, RunPod, etc.) and point
ML_SERVICE_URLat it. Architecture diagram in README.md.
2c. Build & start commands
If render.yaml is missing these, set them manually:
hasari-api (web service):
- Build:
pip install -r services/backend/requirements.txt - Start:
cd services/backend && alembic upgrade head && uvicorn main:app --host 0.0.0.0 --port $PORT
hasari-worker (background worker):
- Build: same
- Start:
cd services/backend && celery -A worker worker --loglevel=info --concurrency=2
Step 3 β First deploy
- Both services auto-deploy on push to
main. Trigger the first deploy:- Go to hasari-api β Manual Deploy β Deploy latest commit.
- Watch the build log. Expected duration: 8β15 minutes (downloads PyTorch, YOLO weights).
- The first start runs
alembic upgrade headβ DB schema is created. - Once "Your service is live" appears, hit the health endpoint:
curl https://hasari-api.onrender.com/health
Expected:
{"status":"ok","ml_loaded":true,"timestamp":"2026-05-15T...","version":"0.1.0"}
If ml_loaded is false, check the start log for "ML pipeline init failed" β usually means model weights are missing from the container. Run a one-off SSH session and run python -c "from ml_service import ml_pipeline; print(ml_pipeline.is_loaded())".
Step 4 β Create the admin user
The first admin must be created out-of-band β there is no admin-registration UI. SSH into the API container via the Render dashboard (Shell tab) and run:
cd services/backend
python -c "
from database import init_db
from auth import _repo
from security import hash_password
init_db()
user = _repo.create(
email='admin@yourcompany.com',
password_hash=hash_password('CHANGE_ME_now_strong_password'),
full_name='Admin',
)
# Promote
import psycopg
with psycopg.connect('$DATABASE_URL') as conn:
conn.execute('UPDATE users SET role=%s WHERE id=%s', ('admin', user['id']))
conn.commit()
print('Admin user created:', user['email'])
"
Sign in immediately at https://hasari.app/login and rotate the password through the UI.
Step 5 β Smoke test checklist
Before announcing the deploy is "done", run through this list. Each item should pass on the first try.
-
GET /healthreturns 200 withml_loaded: true -
GET /api/v1/versionreturns expected git SHA and build time -
POST /auth/registerwith a new email returns 201 + token pair -
POST /auth/loginwith that email returns 200 + new token pair -
GET /auth/mewith the access token returns the user -
POST /auth/refreshwith the refresh token returns a new token pair -
POST /api/v1/inspect/syncwith a 1MB JPG returns 200 with parts/damages JSON within 15 seconds -
GET /api/v1/inspectreturns the inspection in the list -
GET /api/v1/inspect/{id}/visualization/annotatedredirects to a presigned S3 URL that returns a PNG -
DELETE /api/v1/inspect/{id}removes it (subsequent GET returns 404) - Web app loads at custom domain, language defaults to TR
- Sign in via web app, complete one inspection end-to-end
- Open Render logs β no
ERRORorCRITICALentries in the past hour - Postgres connection count < 20 (visible in Render β hasari-db β Metrics)
If any item fails, do not announce the launch. See Troubleshooting.
Monitoring & log access
Logs
- Render dashboard: hasari-api β Logs tab β live tail.
- CLI:
render logs --service hasari-api --tail(install render-cli). - Structured JSON: every log line is
{"time":..., "level":..., "logger":..., "msg":...}β pipe tojqfor filtering.
Metrics
- Render built-in: CPU, memory, response time, throughput visible per service in the dashboard.
- Prometheus: scrape
https://hasari-api.onrender.com/metrics(requiresAuthorization: Bearer <admin token>). Seeobservability/for a Grafana dashboard JSON to import.
Alerts
Configure in Render β service β Notifications:
- Deploy failed β Slack/email
- Service crashed β on-call rotation
- Disk usage > 80% β Slack
For app-level alerts (error rate > 1%, p95 latency > 3s), set up Sentry alerts on the SENTRY_DSN project.
Rolling back a bad deploy
If the latest deploy is broken:
- Render dashboard β hasari-api β Events tab.
- Find the last known-good deploy (green checkmark).
- Click Rollback to this deploy.
- Confirm. Render redeploys the previous Docker image β takes ~30 seconds.
For database migrations that cannot be rolled back automatically:
# In the Render shell:
cd services/backend
alembic downgrade -1
Important: never alembic downgrade a migration that dropped a column with live data β you will lose data. Pre-launch, test every migration's downgrade() against a copy of production data.
Cost estimate (monthly, pilot scale)
| Item | Plan | Cost |
|---|---|---|
Render web service (hasari-api) |
Starter (512 MB) | $7 |
Render background worker (hasari-worker) |
Starter (512 MB) | $7 |
| Render Postgres | Starter | $7 |
| Render Redis | Starter | $10 |
| AWS S3 (10 GB storage, 100k req/month) | Pay-as-you-go | ~$1 |
| AWS data transfer (out) | Pay-as-you-go | ~$2 |
| Custom domain | (you own it) | $0 |
| Sentry (free tier) | Developer | $0 |
| Total | ~$34/month |
Scaling beyond ~500 inspections/day will require:
- Larger Render plans (Standard: $25/service)
- Moving ML to a GPU VPS (Hetzner GPU: $80/month)
- S3 storage growth: $0.023/GB/month
Troubleshooting
ml_loaded: false at startup
Cause: model weights missing or wrong path.
Fix: ensure services/ml/yolo11m-seg.pt, yolo11s-seg.pt, yolo11n-cls.pt are committed to the repo or downloaded in the build step. Check ML_MODEL_DIR env var.
503 "Is kuyrugu su an kullanilamiyor"
Cause: Celery worker can't reach Redis.
Fix: confirm REDIS_URL is set on hasari-worker (not just api). Restart worker.
Postgres connection limit exceeded
Cause: too many open connections β usually a long-running query or leaked sessions.
Fix: check Render β Postgres β Metrics β "Active connections". Restart API service to drop them. Add pool_pre_ping=True and pool_recycle=300 to SQLAlchemy engine config.
S3 403 on upload
Cause: IAM policy doesn't grant s3:PutObject, or bucket name typo, or wrong region.
Fix: run aws s3 ls s3://your-bucket-name --region eu-central-1 from anywhere with the same credentials to verify.
Web app CORS error
Cause: CORS_ORIGINS doesn't include the actual origin (don't forget https://, no trailing slash).
Fix: update env var, redeploy API.
Health check passing but inspections always fail with 500
Cause: usually an unhandled exception in the ML pipeline (corrupt model, OOM, missing class).
Fix: enable LOG_LEVEL=DEBUG, reproduce, read the traceback in logs. If it's OOM, upgrade to Standard plan or move ML off-box.
Migration locked / alembic upgrade head hangs
Cause: a previous migration left a lock in the alembic_version table.
Fix: in psql, DELETE FROM alembic_locks; (table name varies β check your alembic config) or set LOCK_TIMEOUT and retry.
Post-launch monitoring (first 48 hours)
Watch these metrics every 4 hours for the first two days:
- Error rate (Sentry): target < 0.5% of requests
- API latency p95 (Render metrics): target < 1.5 s for non-inspection endpoints
- Inspection latency p95: target < 12 s end-to-end for 4-photo batches
- Database active connections: target < 50% of pool max
- Redis memory: target < 80% of plan limit
- Failed inspection rate: target < 2% of jobs reaching
failed
If any metric exceeds target for 30+ minutes, treat as a P1 incident.
Related docs
- API_GUIDE.md β REST contract for smoke tests
- AUTH_FLOW.md β token lifecycle (env vars relevant)
- LAUNCH_CHECKLIST.md β pre-go-live sign-off gates
- OBSERVABILITY_SETUP.md β Prometheus + Grafana wiring