# Contributing

Riprap is the hackathon submission for the AMD × lablab.ai
Developer Hackathon, but the source ships under Apache 2.0 and is
intended to be reusable as a template for citation-grounded civic
AI in any flood-vulnerable region. Pull requests welcome.

## Quickstart

Python 3.12 + `uv`:

```bash
git clone https://github.com/msradam/riprap-nyc
cd riprap-nyc
uv venv && uv pip install -r requirements.txt
```

SvelteKit (the build is committed; only rebuild when sources
change under `web/sveltekit/src`):

```bash
cd web/sveltekit && npm ci && npm run build && cd ../..
```

Run the dev server locally pointing at the production inference
Space (real Granite + EO models, real NVML energy readings):

```bash
RIPRAP_LLM_PRIMARY=vllm \
RIPRAP_LLM_BASE_URL=https://msradam-riprap-vllm.hf.space/v1 \
RIPRAP_LLM_API_KEY=<token> \
RIPRAP_ML_BACKEND=remote \
RIPRAP_ML_BASE_URL=https://msradam-riprap-vllm.hf.space \
RIPRAP_ML_API_KEY=<token> \
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860
```
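The command above only works when all six variables are set together, so a small pre-flight check can catch a missing one before `uvicorn` starts. The sketch below is a hypothetical helper, not part of the repository; only the variable names are taken from the command above.

```python
import os

# Variable names copied from the remote-inference command above.
REMOTE_ENV_VARS = (
    "RIPRAP_LLM_PRIMARY",
    "RIPRAP_LLM_BASE_URL",
    "RIPRAP_LLM_API_KEY",
    "RIPRAP_ML_BACKEND",
    "RIPRAP_ML_BASE_URL",
    "RIPRAP_ML_API_KEY",
)


def missing_remote_env(env=None):
    """Return the names of remote-inference variables that are unset or blank.

    Hypothetical helper for illustration: `env` defaults to os.environ, and
    a mapping can be passed in to make the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REMOTE_ENV_VARS if not str(env.get(name, "")).strip()]
```

Calling `missing_remote_env()` in a shell where the variables were exported should return an empty list; anything else names what still needs to be set.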

Or run fully locally with Ollama (no GPU power readings; energy falls back to a data-sheet estimate):

```bash
# `ollama pull` takes one model per invocation.
ollama pull granite4.1:3b
ollama pull granite4.1:8b
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860
```

## Verifying changes

Two probe scripts exercise the live deployment end-to-end:

```bash
# All five Stones must fire on the canonical address; emissions
# block must carry nvidia_l4 hardware; no torchvision/terratorch
# dep regressions in the trace.
PYTHONPATH=. uv run python scripts/probe_stones_fire.py --timeout 600

# Full canonical suite: five NYC addresses, intent-aware checks,
# Mellea grounding budget, no specialist crashes.
.venv/bin/python scripts/probe_addresses.py \
    --base https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space
```

Both default to the lablab UI Space; pass `--base http://127.0.0.1:7860`
to hit a local server.
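When iterating against a local server, it can be convenient to run both probes in one go. The following is a minimal sketch under stated assumptions: the script paths, `--timeout 600`, and `--base` come from the commands above, while the wrapper functions, the `PYTHONPATH=.` setting for both scripts, and the idea that a failing probe exits non-zero are assumptions made for illustration.

```python
import os
import subprocess
import sys

LOCAL_BASE = "http://127.0.0.1:7860"


def probe_commands(base=LOCAL_BASE):
    """Build the argv for both probe scripts, pointed at `base`."""
    return [
        [sys.executable, "scripts/probe_stones_fire.py",
         "--timeout", "600", "--base", base],
        [sys.executable, "scripts/probe_addresses.py", "--base", base],
    ]


def run_probes(base=LOCAL_BASE):
    """Run each probe in turn from the repo root; stop at the first failure.

    Assumes each script signals failure with a non-zero exit code.
    """
    env = {**os.environ, "PYTHONPATH": "."}
    for argv in probe_commands(base):
        if subprocess.run(argv, env=env).returncode != 0:
            return False
    return True
```

Run `run_probes()` from the repository root with the local server already listening on port 7860.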

## Structure

```
app/                       Python package: the FSM and its specialists
├── fsm.py                 Burr FSM, one @action per probe
├── llm.py                 LiteLLM Router shim (Ollama / vLLM)
├── inference.py           HTTP client for the riprap-models service
├── emissions.py           Per-query energy + token tracker
├── stones/                Stone taxonomy (NAME / TAGLINE / collect())
├── flood_layers/          Cornerstone probes (sandy, dep, microtopo, …)
├── context/               Keystone + Touchstone register + EO probes
├── live/                  Lodestone forecast probes
├── intents/               single_address / neighborhood / compare / live_now
├── reconcile.py           Capstone: Granite-native document reconcile
└── mellea_validator.py    Mellea four-check rejection sampling

web/                       FastAPI + SvelteKit
├── main.py                FastAPI app, SSE streaming, layer endpoints
├── sveltekit/             Primary UI (adapter-static; build committed)
└── static/                Legacy custom-element pages (still mounted)

inference-vllm/            Inference Space source (vLLM + EO models + proxy)
├── Dockerfile             L4 image, bakes Granite 4.1 8B FP8 + EO deps
├── entrypoint.sh          Boots vllm, riprap-models, proxy as subprocesses
└── proxy.py               Bearer-auth + NVML power sampler + SSE pass-through

inference/                 Ollama-backed inference Space (fallback variant)
services/riprap-models/    The EO/forecast specialist HTTP service

scripts/
├── probe_stones_fire.py   Programmatic Stone-fire CI
├── probe_addresses.py     Canonical 5-address suite
├── deploy_vllm_space.sh   Deploy the L4 inference Space
├── deploy_personal_space.sh  Deploy the personal L4 mirror
├── deploy_inference_space.sh Deploy the Ollama-backed inference Space
└── …                       Register builders, raster bakers, etc.

experiments/               Reproduction recipes for the three NYC fine-tunes
docs/                      Architecture, methodology, deploy, emissions, runbooks
tests/                     pytest suite (envelope + compare-shape tests)
```

## Style

- Python 3.12; `uv` for package management.
- LLM calls go through `app/llm.py`; never import `litellm` /
  `ollama` directly from a specialist. The `chat()` shim wraps both
  backends and the energy ledger reads off it.
- Remote ML calls go through `app/inference.py::_post`. Specialists
  may try local fallback only when `inference.remote_enabled()` is
  False; once a remote call has been attempted, return a clean
  `{ok: False, skipped: ...}` on failure rather than crashing
  through to local code paths that may not be installed.
- Every specialist emits one trace record per call with `step` /
  `ok` / `elapsed_s` / `result` / `err` so the SSE stream and the
  emissions tracker can reason about it.
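
The remote-fallback and trace-record rules above can be sketched together. This is an illustration under stated assumptions, not the repository's code: `run_specialist`, `remote_call`, and `local_call` are hypothetical names, and the real specialists would use `app/inference.py` and `inference.remote_enabled()` rather than the stand-ins shown here. Only the record keys and the `{ok: False, skipped: ...}` shape come from the list above.

```python
import time
from typing import Any, Callable


def run_specialist(
    step: str,
    remote_enabled: bool,
    remote_call: Callable[[], Any],
    local_call: Callable[[], Any],
) -> dict[str, Any]:
    """Run one specialist and return exactly one trace record.

    Hypothetical helper: `remote_call` stands in for a request made through
    app/inference.py, `local_call` for an in-process fallback.
    """
    started = time.perf_counter()
    record: dict[str, Any] = {
        "step": step, "ok": False, "elapsed_s": 0.0, "result": None, "err": None,
    }
    if remote_enabled:
        try:
            record["result"] = remote_call()
            record["ok"] = True
        except Exception as exc:
            # Remote was attempted: report a clean skip and do NOT fall
            # through to local code paths that may not be installed.
            record["result"] = {"ok": False, "skipped": "remote call failed"}
            record["err"] = str(exc)
    else:
        try:
            record["result"] = local_call()
            record["ok"] = True
        except Exception as exc:
            record["err"] = str(exc)
    record["elapsed_s"] = round(time.perf_counter() - started, 3)
    return record
```

The point of the shape is that the caller always gets a record back, so the SSE stream and the emissions tracker never have to handle an exception from a specialist.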

## Reporting issues

GitHub issues at <https://github.com/msradam/riprap-nyc/issues>.
For hackathon-period demo issues during May 4–10, 2026, the live
deploy at
<https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space>
is the source of truth.