# Deployment Guide (Max / Person C)
---
## Local Development
```bash
# Create and activate virtualenv
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install server deps
pip install -r server/requirements.txt
# Install replicalab package
pip install -e . --no-deps
# Run the server
uvicorn server.app:app --host 0.0.0.0 --port 7860 --reload
```
Server should be available at `http://localhost:7860`.
Quick smoke test:
```bash
curl http://localhost:7860/health
curl -X POST http://localhost:7860/reset \
-H "Content-Type: application/json" \
-d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'
```
---
## Docker (Local)
```bash
docker build -f server/Dockerfile -t replicalab .
docker run -p 7860:7860 replicalab
```
### Verified endpoints (API 08 sign-off, 2026-03-08)
After `docker run -p 7860:7860 replicalab`, the following were verified
against the **real env** (not stub):
```bash
curl http://localhost:7860/health
# → {"status":"ok","env":"real"}
curl http://localhost:7860/scenarios
# → {"scenarios":[{"family":"math_reasoning",...}, ...]}
curl -X POST http://localhost:7860/reset \
-H "Content-Type: application/json" \
-d '{"seed":42,"scenario":"math_reasoning","difficulty":"easy"}'
# → {"session_id":"...","episode_id":"...","observation":{...}}
# Use session_id from reset response:
curl -X POST http://localhost:7860/step \
-H "Content-Type: application/json" \
-d '{"session_id":"<SESSION_ID>","action":{"action_type":"propose_protocol","sample_size":3,"controls":["baseline"],"technique":"algebraic_proof","duration_days":1,"required_equipment":[],"required_reagents":[],"questions":[],"rationale":"Test."}}'
# → {"observation":{...},"reward":0.0,"done":false,"info":{...}}
```
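
The same reset-then-step sequence can be scripted. The following is a minimal sketch using only the Python standard library; the helper names `http_post_json` and `reset_then_step` are hypothetical and not part of the repo, and the response fields are assumed to match the shapes shown above.

```python
import json
import urllib.request

# Action payload copied from the curl example above.
PROPOSE_PROTOCOL_ACTION = {
    "action_type": "propose_protocol",
    "sample_size": 3,
    "controls": ["baseline"],
    "technique": "algebraic_proof",
    "duration_days": 1,
    "required_equipment": [],
    "required_reagents": [],
    "questions": [],
    "rationale": "Test.",
}


def http_post_json(base_url):
    """Return a post(path, payload) callable bound to base_url (hypothetical helper)."""
    def post(path, payload):
        request = urllib.request.Request(
            url=base_url.rstrip("/") + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=10.0) as response:
            return json.loads(response.read().decode("utf-8"))
    return post


def reset_then_step(post, action=PROPOSE_PROTOCOL_ACTION):
    """Call /reset, then /step with the session_id taken from the reset response."""
    reset = post("/reset", {"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"})
    session_id = reset["session_id"]  # assumed present, as in the verified response
    return post("/step", {"session_id": session_id, "action": action})
```

With the container running, `reset_then_step(http_post_json("http://localhost:7860"))` should return the `/step` response as a dict. Passing the transport in as a callable keeps the sequencing logic testable without a live server.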
With optional hosted-model secrets:
```bash
docker run -p 7860:7860 \
-e MODEL_API_KEY=replace-me \
replicalab
```
---
## Hugging Face Spaces Deployment
### What is already configured (API 09)
The repo is now deployment-ready for HF Spaces:
- **Root `Dockerfile`**: HF Spaces requires the Dockerfile at the repo root.
The root-level `Dockerfile` is identical to `server/Dockerfile`. Keep them
in sync, or delete `server/Dockerfile` once the team standardizes.
- **`README.md` frontmatter**: The root README now contains the required
YAML frontmatter that HF Spaces parses on push:
```yaml
---
title: ReplicaLab
emoji: 🧪
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
```
- **Non-root user**: The Dockerfile creates and runs as `appuser` (UID 1000),
which HF Spaces requires for security.
- **Port 7860**: Both the `EXPOSE` directive and the `uvicorn` CMD use 7860,
matching the `app_port` in the frontmatter.
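
Before pushing, the points above can be checked mechanically. This is a rough sketch, not a script that exists in the repo: `parse_frontmatter` and `check_space_config` are hypothetical names, and the parser only handles the flat `key: value` fields shown in the frontmatter above.

```python
def parse_frontmatter(readme_text):
    """Parse flat `key: value` frontmatter between the leading `---` lines.

    Handles only simple scalar fields; it is not a general YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing '---' found, so treat the frontmatter as missing


def check_space_config(readme_text, dockerfile_text, expected_port=7860):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    meta = parse_frontmatter(readme_text)
    if meta.get("sdk") != "docker":
        problems.append("README frontmatter: sdk is not 'docker'")
    if meta.get("app_port") != str(expected_port):
        problems.append(f"README frontmatter: app_port is not {expected_port}")
    if f"EXPOSE {expected_port}" not in dockerfile_text:
        problems.append(f"Dockerfile: missing 'EXPOSE {expected_port}'")
    return problems
```

From the repo root, one way to use it is `check_space_config(Path("README.md").read_text(encoding="utf-8"), Path("Dockerfile").read_text(encoding="utf-8"))` with `from pathlib import Path`.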
### Step-by-step deployment (for Max)
#### 1. Create the Space
1. Go to https://huggingface.co/new-space
2. Fill in:
- **Owner:** your HF username or the team org
- **Space name:** `replicalab` (or `replicalab-demo`)
- **License:** MIT
- **SDK:** Docker
- **Hardware:** CPU Basic (free tier is fine for the server)
- **Visibility:** Public
3. Click **Create Space**
#### 2. Add the Space as a git remote
```bash
# From the repo root
git remote add hf https://huggingface.co/spaces/<YOUR_HF_USERNAME>/replicalab
# If the org is different:
# git remote add hf https://huggingface.co/spaces/<ORG>/replicalab
```
#### 3. Push the repo
```bash
# Push the current branch to the Space
git push hf ayush:main
# Or if deploying from master:
# git push hf master:main
```
HF Spaces will automatically detect the `Dockerfile`, build the image, and
start the container.
#### 4. Monitor the build
1. Go to https://huggingface.co/spaces/\<YOUR_HF_USERNAME\>/replicalab
2. Click the **Logs** tab (or **Build** tab during first deploy)
3. Wait for the build to complete (typically 2-5 minutes)
4. The Space status should change from "Building" to "Running"
#### 5. Verify the deployment (API 10 scope)
Once the Space is running:
```bash
# Health check
curl https://ayushozha-replicalab.hf.space/health
# Reset an episode
curl -X POST https://ayushozha-replicalab.hf.space/reset \
-H "Content-Type: application/json" \
-d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'
# List scenarios
curl https://ayushozha-replicalab.hf.space/scenarios
```
WebSocket test (using websocat or wscat):
```bash
wscat -c wss://ayushozha-replicalab.hf.space/ws
# Then type: {"type": "ping"}
# Expect: {"type": "pong"}
```
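
The ping/pong check can also be run from Python. The sketch below assumes the `websocket-client` package is installed (it provides the `websocket` module) and that the server answers a ping with `{"type": "pong"}` as the first frame on the connection; `is_pong` and `ping_once` are hypothetical helper names.

```python
import json


def is_pong(raw_message):
    """Return True if a raw WebSocket text frame is the {"type": "pong"} reply."""
    try:
        message = json.loads(raw_message)
    except (TypeError, ValueError):
        return False
    return isinstance(message, dict) and message.get("type") == "pong"


def ping_once(ws_url, timeout=10.0):
    """Open a WebSocket, send one ping, and report whether a pong came back."""
    # Imported lazily so is_pong() works even without the dependency installed.
    from websocket import create_connection  # provided by `websocket-client`

    connection = create_connection(ws_url, timeout=timeout)
    try:
        connection.send(json.dumps({"type": "ping"}))
        # Assumption: the pong is the first frame the server sends back.
        return is_pong(connection.recv())
    finally:
        connection.close()
```

For example, `ping_once("wss://ayushozha-replicalab.hf.space/ws")` should return `True` when the Space is awake.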
### Verified live deployment (API 10 sign-off, 2026-03-08)
**Public Space URL:** https://huggingface.co/spaces/ayushozha/replicalab
**API base URL:** `https://ayushozha-replicalab.hf.space`
All four endpoints verified against the live Space with real env:
```
GET /health → 200 {"status":"ok","env":"real"}
GET /scenarios → 200 {"scenarios":[...3 families...]}
POST /reset → 200 {"session_id":"...","episode_id":"...","observation":{...}}
POST /step → 200 {"reward":2.312798,"done":true,"info":{"verdict":"accept",...}}
```
Full episode verified: reset → propose_protocol → accept → terminal reward
with real judge scoring (rigor=0.465, feasibility=1.000, fidelity=0.325,
total_reward=2.313, verdict=accept).
---
## Secrets and API Key Management (API 17)
### Current state
The server is **fully self-contained with no external API calls**.
No secrets or API keys are required to run the environment, judge, or
scoring pipeline. All reward computation is deterministic and local.
### Where secrets live (by context)
| Context | Location | What to set | Required? |
|---------|----------|-------------|-----------|
| **HF Space** | Space Settings → Repository secrets | Nothing currently | No |
| **Local dev** | Shell env vars or `.env` file (gitignored) | Nothing currently | No |
| **Docker** | `-e KEY=value` flags on `docker run` | Nothing currently | No |
| **Colab notebook** | `google.colab.userdata` or env vars | `HF_TOKEN` for model downloads, `REPLICALAB_URL` for hosted env | Yes for training |
### Colab notebook secrets
When running the training notebook, the following are needed:
| Secret | Purpose | Where to set | Required? |
|--------|---------|-------------|-----------|
| `HF_TOKEN` | Download gated models (Qwen3-4B) from HF Hub | Colab Secrets panel (key icon) | Yes |
| `REPLICALAB_URL` | URL of the hosted environment | Hardcode or Colab secret | Optional; defaults to `https://ayushozha-replicalab.hf.space` |
To set in Colab:
1. Click the key icon in the left sidebar
2. Add `HF_TOKEN` with your Hugging Face access token
3. Access in code:
```python
from google.colab import userdata
hf_token = userdata.get("HF_TOKEN")
```
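
For `REPLICALAB_URL`, the fallback described in the table can be sketched as follows. This assumes the value is exposed as an environment variable; `resolve_replicalab_url` is a hypothetical helper, not a function from the notebook.

```python
import os

# Hosted default taken from the table above.
DEFAULT_REPLICALAB_URL = "https://ayushozha-replicalab.hf.space"


def resolve_replicalab_url(environ=None):
    """Return REPLICALAB_URL if set and non-empty, else the hosted default."""
    env = os.environ if environ is None else environ
    url = (env.get("REPLICALAB_URL") or "").strip()
    # Strip any trailing slash so callers can append paths like "/reset".
    return (url or DEFAULT_REPLICALAB_URL).rstrip("/")
```

Pointing the notebook at a local server is then a matter of setting `REPLICALAB_URL=http://localhost:7860` before the cell runs.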
### Future secrets (not currently needed)
If a frontier hosted evaluator is added later:
| Secret name | Purpose | Required? |
|-------------|---------|-----------|
| `MODEL_API_KEY` | Hosted evaluator access key | Only if a hosted evaluator is added |
| `MODEL_BASE_URL` | Alternate provider endpoint | Only if using a proxy |
These would be set in HF Space Settings → Repository secrets, and
accessed via `os.environ.get("MODEL_API_KEY")` in server code.
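
If that evaluator is ever added, the server-side lookup might look like the sketch below. Nothing here exists in the server today: `load_evaluator_settings` and `describe_settings` are hypothetical, and the only names taken from this guide are the two environment variables.

```python
import os


def load_evaluator_settings(environ=None):
    """Read the optional hosted-evaluator secrets; None means 'not configured'."""
    env = os.environ if environ is None else environ
    api_key = env.get("MODEL_API_KEY") or None
    base_url = env.get("MODEL_BASE_URL") or None
    return {
        "enabled": api_key is not None,
        "api_key": api_key,
        "base_url": base_url,
    }


def describe_settings(settings):
    """Build a log-safe summary that never includes the key itself."""
    state = "configured" if settings["enabled"] else "not configured"
    endpoint = settings["base_url"] or "provider default"
    return f"hosted evaluator: {state}; endpoint: {endpoint}"
```

Treating an unset or empty key as "not configured" keeps the current no-secrets behaviour as the default, and logging only the summary avoids leaking the key into Space logs.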
### Re-deploying after code changes
```bash
# Just push again; HF rebuilds automatically
git push hf ayush:main
```
To force a full rebuild (e.g. after dependency changes):
1. Go to Space **Settings**
2. Click **Factory reboot** under the Danger zone section
### Known limitations
- **Free CPU tier** has 2 vCPU and 16 GB RAM. This is sufficient for the
FastAPI server but NOT for running RL training. Training happens in Colab.
- **Cold starts**: Free-tier Spaces sleep after 48 hours of inactivity.
The first request after sleep takes 30-60 seconds while the container restarts.
- **Persistent storage**: Episode replays and logs are in-memory only.
They reset when the container restarts. This is acceptable for the
hackathon demo.
- **Heavy hosted models require billing-enabled hardware**: as of
2026-03-09, the checked HF token authenticates successfully but the backing
account reports `canPay=false` and has no org attached, so it is currently
suitable for model downloads but not for provisioning paid large-model
serving through HF Spaces hardware or Inference Endpoints.
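
Clients can absorb the cold-start delay by polling `/health` before sending real traffic. The following is a minimal sketch under stated assumptions: the 90-second budget and 5-second interval are illustrative values chosen to sit above the 30-60 second estimate, and `probe_health` and `wait_until_healthy` are hypothetical helpers.

```python
import json
import time
import urllib.request


def probe_health(base_url, timeout=10.0):
    """Return True if GET /health answers with a JSON object whose status is "ok"."""
    try:
        with urllib.request.urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors, and timeouts.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(probe, max_wait_s=90.0, interval_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it returns True or max_wait_s elapses; return the outcome."""
    deadline = clock() + max_wait_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would typically run `wait_until_healthy(lambda: probe_health("https://ayushozha-replicalab.hf.space"))` once at startup. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.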
---
## Environment URLs Reference
| Service | Local | Hosted |
|---------|-------|--------|
| FastAPI app | `http://localhost:7860` | `https://ayushozha-replicalab.hf.space` |
| Health | `http://localhost:7860/health` | `https://ayushozha-replicalab.hf.space/health` |
| WebSocket | `ws://localhost:7860/ws` | `wss://ayushozha-replicalab.hf.space/ws` |
| Scenarios | `http://localhost:7860/scenarios` | `https://ayushozha-replicalab.hf.space/scenarios` |
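
Every row in the table follows from the app base URL, so client code only needs to store one value. A small sketch of that derivation, with `service_urls` as a hypothetical helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def service_urls(base_url):
    """Derive the health, scenarios, and WebSocket URLs from an app base URL."""
    parts = urlsplit(base_url.rstrip("/"))
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) base URL, got {base_url!r}")
    # https maps to wss and http maps to ws, as in the table above.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    root = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    ws_root = urlunsplit((ws_scheme, parts.netloc, parts.path, "", ""))
    return {
        "app": root,
        "health": f"{root}/health",
        "scenarios": f"{root}/scenarios",
        "websocket": f"{ws_root}/ws",
    }
```

For instance, `service_urls("http://localhost:7860")["websocket"]` evaluates to `ws://localhost:7860/ws`.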
---
## Northflank CLI Access
### Local verification (2026-03-08)
- Installed globally with `npm i -g @northflank/cli`
- Verified locally with `northflank --version`
- Current verified version: `0.10.16`
### Login
```bash
northflank login -n <context-name> -t <token>
```
`<token>` must come from the user's Northflank account or team secret
manager. Do not commit it to the repo.
### Service access commands for `replica-labs/replicalab-ai`
```bash
northflank forward service --projectId replica-labs --serviceId replicalab-ai
northflank get service logs --tail --projectId replica-labs --serviceId replicalab-ai
northflank ssh service --projectId replica-labs --serviceId replicalab-ai
northflank exec service --projectId replica-labs --serviceId replicalab-ai
northflank upload service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
northflank download service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
```
### Current Northflank runtime findings (2026-03-09)
- The manual training job `replicalab-train` exists in `replica-labs`, but
`northflank start job run --projectId replica-labs --jobId replicalab-train`
currently fails with `409 No deployment configured`.
- The job still has runtime variables configured, including the older remote
`MODEL_NAME=Qwen/Qwen3-8B`, so even after the missing deployment is fixed the
runtime config should be reviewed before launching training.
- The live service `replicalab-ai` is deployed on the same
`nf-gpu-hack-16-64` billing plan, but a direct probe from inside the
container found no `nvidia-smi` binary and no `/dev/nvidia*` device nodes.
Treat GPU/H100 availability as unverified until a container can prove
hardware visibility from inside the runtime.
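
The in-container probe described above can be repeated with a short script. This is a sketch of one way to do it, not the exact probe that was run: `probe_gpu_visibility` is a hypothetical name, and it only checks for the `nvidia-smi` binary and `/dev/nvidia*` device nodes mentioned in the findings.

```python
import glob
import shutil
import subprocess


def probe_gpu_visibility():
    """Report whether NVIDIA tooling and device nodes are visible in this runtime."""
    nvidia_smi = shutil.which("nvidia-smi")
    device_nodes = sorted(glob.glob("/dev/nvidia*"))
    gpu_names = []
    if nvidia_smi:
        # Ask nvidia-smi for one GPU name per line; treat any failure as "no GPUs".
        try:
            result = subprocess.run(
                [nvidia_smi, "--query-gpu=name", "--format=csv,noheader"],
                capture_output=True, text=True, timeout=15, check=False,
            )
        except (OSError, subprocess.SubprocessError):
            result = None
        if result is not None and result.returncode == 0:
            gpu_names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return {
        "nvidia_smi_path": nvidia_smi,
        "device_nodes": device_nodes,
        "gpu_names": gpu_names,
        "gpu_visible": bool(device_nodes) and bool(gpu_names),
    }
```

Running `print(probe_gpu_visibility())` inside the container (for example through `northflank exec service`) gives a single dict to paste into the findings log.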
### Current Northflank notebook findings (2026-03-09)
- There is a separate live notebook service in project `notebook-openport`:
`jupyter-pytorch`.
- The active public notebook DNS is
`app--jupyter-pytorch--9y6g97v7czb9.code.run` on port `8888` (`/lab` for the
Jupyter UI).
- Northflank reports that service with GPU config
`gpuType=h100-80`, `gpuCount=1`, and an in-container probe confirmed
`NVIDIA H100 80GB HBM3`.
- The notebook image is `quay.io/jupyter/pytorch-notebook:cuda12-2025-08-18`.
- The notebook currently contains a repo clone and GRPO outputs, but the saved
notebook/log state is not clean: training produced adapter checkpoints
through step 200, then later notebook evaluation/inference failed with a
`string indices must be integers, not 'str'` content-format error.
### Windows note
Global npm binaries resolve from `C:\Users\ayush\AppData\Roaming\npm` on this
machine. If `northflank` is not found in a new shell, reopen the terminal so
the updated PATH is reloaded.
---
## Hand-off To Ayush
**Local server:**
- WebSocket: `ws://localhost:7860/ws`
- REST health: `http://localhost:7860/health`
- Running against: **real env** (not stub)
**Hosted deployment (verified 2026-03-08):**
- Base URL: `https://ayushozha-replicalab.hf.space`
- `/health` returns `200` with `{"status":"ok","env":"real"}`
- WebSocket path: `wss://ayushozha-replicalab.hf.space/ws`
- Full episode tested: propose → accept → reward with real judge scores
---
## Troubleshooting
| Issue | Fix |
|-------|-----|
| `ReplicaLabEnv not found` warning at startup | The real env is now available; ensure `replicalab/scoring/rubric.py` is present and `httpx` + `websocket-client` are in `server/requirements.txt` |
| Docker build fails | Re-check `server/requirements.txt` and the Docker build context |
| CORS error from the frontend | Re-check allowed origins in `server/app.py` |
| WebSocket closes after idle time | Send periodic ping messages or reconnect |
| Session not found (REST) | Call `/reset` again to create a new session |