# HF Space Build Error Log: rubentuesday/vocal-mirror
This file is committed alongside every fix so the repo retains full context of what broke and why.
---
## Iteration 1 - 2026-04-11
**Stage:** CONFIG_ERROR
**Error:** `No candidate PyTorch version found for ZeroGPU`
**Root cause:** `requirements.txt` pinned `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` with `--extra-index-url https://download.pytorch.org/whl/cu121`. ZeroGPU manages its own CUDA PyTorch installation and rejects Spaces that pin a `+cu121`-suffixed variant. The build fails at config parse time, before any package is installed.
**Fix applied:**
- Removed `--extra-index-url https://download.pytorch.org/whl/cu121` from `requirements.txt`
- Removed `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` from `requirements.txt` (ZeroGPU provides these)
- Changed `gradio>=5.0.0,<6.0` → `gradio==4.44.1` in `requirements.txt` (project rule: pin to 4.44.1)
- Changed `sdk_version: 5.0.0` → `sdk_version: 4.44.1` in `README.md` YAML frontmatter
**Result:** FAIL. The CONFIG_ERROR was resolved, but the change caused a new RUNTIME_ERROR (see Iteration 2). Pinning Gradio 4.44.1 was the wrong choice and was reverted in Iteration 2.
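
The two `requirements.txt` patterns removed above can be caught before pushing. The following is a minimal sketch of such a check, not part of the repo: the function name `find_zerogpu_conflicts` and the regex are hypothetical, and it only flags the patterns this log observed.

```python
import re

# Matches a torch-family pin carrying a local CUDA suffix, e.g. "torch==2.5.1+cu121".
CUDA_PIN = re.compile(r"^(torch|torchaudio|torchvision)==[^\s#]*\+cu\d+", re.IGNORECASE)


def find_zerogpu_conflicts(requirements_text: str) -> list[str]:
    """Return requirement lines matching the patterns removed in Iteration 1."""
    conflicts = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("--extra-index-url") and "download.pytorch.org" in line:
            conflicts.append(line)
        elif CUDA_PIN.match(line):
            conflicts.append(line)
    return conflicts


before = """\
--extra-index-url https://download.pytorch.org/whl/cu121
torch==2.5.1+cu121
torchaudio==2.5.1+cu121
gradio>=5.0.0,<6.0
"""
for line in find_zerogpu_conflicts(before):
    print("would be rejected:", line)
```

An unsuffixed pin such as `torchaudio==2.5.1` is not flagged, which matches the pin later used in Iteration 10.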
---
## Iteration 2 - 2026-04-11
**Stage:** RUNTIME_ERROR
**Error 1 (first):** `TypeError: unhashable type: 'dict'` in `jinja2/utils.py` (Gradio 4.x Jinja template cache bug)
**Error 2:** `ValueError: When localhost is not accessible, a shareable link must be created. Please set share=True` (Gradio 4.x requires `share=True` on remote hosts)
**Root cause:** Downgrading to `gradio==4.44.1` reintroduced two known Gradio 4.x bugs. The commit history already shows `c0a2ea8` explicitly upgraded to 5.x to fix the Jinja crash. Both errors are 4.x-only issues fixed in 5.x. The "pin to 4.44.1" instruction in the task brief was outdated.
**Fix applied:**
- Reverted `requirements.txt`: `gradio==4.44.1` → `gradio>=5.0.0,<6.0`
- Reverted `README.md`: `sdk_version: 4.44.1` → `sdk_version: 5.0.0`
**Result:** PASS. Space reached the RUNNING stage, but `/health_hf` returns 308 (route missing). Fixed in Iteration 3.
---
## Iteration 3 - 2026-04-11
**Stage:** RUNNING but `/health_hf` returns 308 Permanent Redirect (no such route)
**Root cause:** `app.py` only has `demo.launch()` with no custom routes. Gradio 5.x redirects unknown paths to `/`.
**Fix applied:** Switched from `demo.launch()` to `gr.mount_gradio_app()` pattern:
- Added `FastAPI` app with `@app.get("/health_hf")` returning `{"status": "ok"}`
- Replaced `demo.launch()` with `app = gr.mount_gradio_app(app, demo, path="/")`
- `@spaces.GPU` decorator still handles ZeroGPU GPU allocation independently
**Result:** FAIL. RUNTIME_ERROR with exit code 0. `gr.mount_gradio_app()` returns immediately, so nothing blocks the process. Fixed in Iteration 4.
---
## Iteration 4 - 2026-04-11
**Stage:** RUNTIME_ERROR: `Exit code: 0. Reason: ` (clean exit, process didn't stay alive)
**Root cause:** `gr.mount_gradio_app()` returns an ASGI app object but doesn't start a server. Without `demo.launch()` blocking, `app.py` runs to completion and exits.
**Fix applied:** Added `uvicorn.run(app, host="0.0.0.0", port=7860)` after the mount call to start the ASGI server and block the process.
**Result:** FAIL. RUNTIME_ERROR: "No @spaces.GPU function detected during startup". `uvicorn.run()` bypasses the `spaces.zero.gradio` launch wrapper that scans for GPU functions. ZeroGPU requires `demo.launch()`. Fixed in Iteration 5.
---
## Iteration 5 - 2026-04-11
**Stage:** RUNTIME_ERROR: `No @spaces.GPU function detected during startup`
**Root cause:** `gr.mount_gradio_app()` + `uvicorn.run()` bypasses the `spaces.zero.gradio` interceptor of `demo.launch()`. ZeroGPU scans for `@spaces.GPU`-decorated functions inside that interceptor. Because the interceptor is never called, the GPU functions are never registered.
**Fix applied:** Reverted to bare `demo.launch()`. Added `/health_hf` by monkey-patching `gradio.routes.App.create_app` to inject the route into the Gradio FastAPI app at creation time, before ZeroGPU starts the server.
**Result:** FAIL. "Application unable to start for an unknown reason". The `create_app.__func__` access likely failed (AttributeError or TypeError) in Gradio 5.x, crashing startup silently. Fixed in Iteration 6.
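
The suspected `__func__` failure can be reproduced with plain Python, without Gradio. The sketch below uses a hypothetical `Demo` class; it makes no claim about how `create_app` is actually declared in any Gradio version. It only shows that `__func__` exists when a classmethod is accessed through its class and does not exist for a staticmethod. The helper `raw_function` is likewise hypothetical.

```python
import inspect


class Demo:
    @classmethod
    def as_classmethod(cls):
        return "classmethod"

    @staticmethod
    def as_staticmethod():
        return "staticmethod"


def raw_function(owner, name):
    """Return the plain function behind owner.name, whatever descriptor wraps it."""
    attr = inspect.getattr_static(owner, name)  # descriptor object, no binding
    return getattr(attr, "__func__", attr)


# Through the class, a classmethod is a bound method, which has __func__.
print(Demo.as_classmethod.__func__.__name__)

# Through the class, a staticmethod is already a plain function: no __func__.
try:
    Demo.as_staticmethod.__func__
except AttributeError as exc:
    print("AttributeError:", exc)

# The static lookup works for both kinds of method.
print(raw_function(Demo, "as_staticmethod")())
print(raw_function(Demo, "as_classmethod")(Demo))
```

A patch that needs the underlying function is less brittle if it goes through a lookup like `raw_function` instead of assuming one descriptor type.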
---
## Iteration 6 - 2026-04-11
**Stage:** RUNTIME_ERROR: "Application unable to start for an unknown reason"
**Root cause:** Monkey-patching `gradio.routes.App.create_app.__func__` crashed at import/startup time in Gradio 5.x. The `__func__` access pattern assumes `create_app` is a classmethod. If the signature or descriptor changed in 5.x, the access raises AttributeError and kills the process before any server starts.
**Fix applied:** Replaced the monkey-patch with a daemon thread that polls `demo.server` (set by Gradio after `demo.launch()` initializes the server) and injects `/health_hf` once it is available. `demo.launch()` stays bare, so ZeroGPU detection works normally. The thread is a no-op if injection fails.
**Result:** FAIL. Space is RUNNING but `/health_hf` still returns 308. `demo.server` is never set in the polling thread's context (ZeroGPU runs the real server in a GPU worker, not the same process). Fixed in Iteration 7.
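
The polling approach from this iteration can be sketched with the standard library alone. This is an illustration under assumptions, not the code from `app.py`: `inject_when_ready` is a hypothetical name, and a `SimpleNamespace` stands in for the Gradio `demo` object. The second call shows the failure mode recorded above: if `server` is never set in this process, the thread times out and nothing is injected.

```python
import threading
import time
from types import SimpleNamespace


def inject_when_ready(demo, inject, timeout=5.0, interval=0.05):
    """Poll demo.server from a daemon thread and call inject(server) once it is set."""
    result = {"injected": False}

    def poll():
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            server = getattr(demo, "server", None)
            if server is not None:
                try:
                    inject(server)
                    result["injected"] = True
                except Exception:
                    pass  # stay a no-op on failure, as the log describes
                return
            time.sleep(interval)

    thread = threading.Thread(target=poll, daemon=True)
    thread.start()
    return thread, result


# Case 1: the server attribute appears in this process, so injection happens.
fake_demo = SimpleNamespace(server=None)  # stand-in for a Gradio Blocks object
seen = []
thread, result = inject_when_ready(fake_demo, seen.append, timeout=1.0)
fake_demo.server = "server-object"
thread.join()
print(result["injected"], seen)

# Case 2: the attribute is never set here, so the poll times out silently.
never_set = SimpleNamespace(server=None)
thread2, result2 = inject_when_ready(never_set, seen.append, timeout=0.2)
thread2.join()
print(result2["injected"])
```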
---
## Iteration 7 - 2026-04-11
**Stage:** RUNNING but `/health_hf` still returns 308
**Root cause:** In ZeroGPU, the actual Gradio server runs in a separate GPU worker process. `demo.server` is never set in the main process, so the daemon thread's poll always fails and the route is never injected.
**Fix applied:** Use `demo.launch(prevent_thread_lock=True)`. The spaces interceptor still detects `@spaces.GPU` functions, then starts the server in a background thread in the same process and returns. After `launch()` returns, `demo.server.app` was expected to be accessible, so `/health_hf` is added there. The main thread is blocked via `threading.Event().wait()` (avoids relying on `demo.block_thread()` existing in Gradio 5.x).
**Result:** FAIL. `AttributeError: 'Server' object has no attribute 'app'`. Gradio 5.x's `Server` wraps uvicorn, so the FastAPI app lives at `server.config.app`, not `server.app`. Fixed in Iteration 8.
---
## Iteration 8 - 2026-04-11
**Stage:** RUNTIME_ERROR: `AttributeError: 'Server' object has no attribute 'app'`
**Root cause:** `demo.server` is a Gradio `Server` (wrapping uvicorn). In uvicorn, the ASGI app is stored in `server.config.app` (the `Config` object passed at construction), not directly on `server.app`.
**Fix applied:** Changed `demo.server.app.get(...)` → `demo.server.config.app.get(...)`.
**Result:** FAIL. Space RUNNING but `/health_hf` still returns 308. `demo.server.config.app.get()` adds the route AFTER Gradio's catch-all `/{path_name:path}` is already registered. FastAPI matches routes in insertion order, so the catch-all, added first, wins. Fixed in Iteration 9.
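
The ordering problem can be seen in a toy first-match router. This is a simplified model written for illustration, not Starlette's router: `ToyRouter` and `compile_path` are hypothetical, and only the "first registered match wins" behaviour described above is modelled.

```python
import re


def compile_path(template):
    """Translate '{name}' and '{name:path}' placeholders into a regex (toy version)."""
    pattern, pos = "", 0
    for m in re.finditer(r"\{(\w+)(?::(\w+))?\}", template):
        pattern += re.escape(template[pos:m.start()])
        # A ':path' placeholder may span slashes; a plain one may not.
        body = ".*" if m.group(2) == "path" else "[^/]+"
        pattern += f"(?P<{m.group(1)}>{body})"
        pos = m.end()
    pattern += re.escape(template[pos:])
    return re.compile(f"^{pattern}$")


class ToyRouter:
    def __init__(self):
        self.routes = []

    def add(self, template, name):
        self.routes.append((compile_path(template), name))  # append-only

    def match(self, path):
        for regex, name in self.routes:  # registration order
            if regex.match(path):
                return name
        return None


late = ToyRouter()
late.add("/{path_name:path}", "catch_all")  # registered first
late.add("/health_hf", "health")            # appended afterwards
print(late.match("/health_hf"))   # catch_all

early = ToyRouter()
early.add("/health_hf", "health")
early.add("/{path_name:path}", "catch_all")
print(early.match("/health_hf"))  # health
```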
---
## Iteration 9 - 2026-04-11
**Stage:** RUNNING but `/health_hf` returns 308
**Root cause:** Adding `@app.get("/health_hf")` after `create_app` appends the route AFTER Gradio's catch-all `/{path_name:path}`. FastAPI/Starlette matches routes in registration order, so the catch-all, registered first, intercepts everything, including `/health_hf`.
**Fix applied:** Use Starlette middleware (`BaseHTTPMiddleware`) patched into Gradio's `create_app`. Middleware runs BEFORE any route matching, so `/health_hf` is intercepted before the catch-all. Reverted to bare `demo.launch()` (ZeroGPU works). The entire patch is wrapped in `try/except` so failures are silent and don't prevent startup.
**Result:** PASS ✓. Space RUNNING, `GET /health_hf` → `{"status":"ok"}` HTTP 200. All done after 9 iterations.
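
Why middleware sidesteps the catch-all can be shown with a bare ASGI wrapper. The log used Starlette's `BaseHTTPMiddleware`; this sketch swaps in a hand-written ASGI class so it runs with only the standard library. `gradio_like_app`, `HealthMiddleware`, and `call` are hypothetical names, and the inner app is a stand-in that redirects every path, as the catch-all did.

```python
import asyncio
import json


async def gradio_like_app(scope, receive, send):
    """Stand-in for the inner app: redirects every path to '/' with a 308."""
    await send({"type": "http.response.start", "status": 308,
                "headers": [(b"location", b"/")]})
    await send({"type": "http.response.body", "body": b""})


class HealthMiddleware:
    """Plain ASGI wrapper: answers GET /health_hf itself, forwards the rest."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if (scope["type"] == "http" and scope["method"] == "GET"
                and scope["path"] == "/health_hf"):
            body = json.dumps({"status": "ok"}).encode()
            await send({"type": "http.response.start", "status": 200,
                        "headers": [(b"content-type", b"application/json")]})
            await send({"type": "http.response.body", "body": body})
            return  # the inner app (and its router) never sees this request
        await self.app(scope, receive, send)


async def call(app, path):
    """Drive an ASGI app once and return (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent[0]["status"], sent[1]["body"]


app = HealthMiddleware(gradio_like_app)
print(asyncio.run(call(app, "/health_hf")))
print(asyncio.run(call(app, "/anything-else")))
```

Because the wrapper sits outside the inner app, route registration order inside that app no longer matters for the intercepted path.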
---
## Iteration 10 - 2026-04-12
**Stage:** RUNNING but "Run Benchmark" throws OSError
**Error:** `OSError: Could not load this library: /usr/local/lib/python3.10/site-packages/torchaudio/lib/_torchaudio.abi3.so`
**Root cause (via runtime logs):** `qwen-tts` depends on `torchaudio`. `pip install qwen-tts` upgraded `torchaudio` to the latest PyPI release, which was compiled against CUDA 13 (`libcudart.so.13`). ZeroGPU A10G runs CUDA 12, so `libcudart.so.13` is not present. Full import chain: `from qwen_tts import Qwen3TTSModel` → `speech_vq.py` → `import torchaudio.compliance.kaldi` → `torchaudio/__init__.py` → `torchaudio._extension` → `torch.ops.load_library("_torchaudio.abi3.so")` → `OSError: libcudart.so.13`.
**Fix applied:** Pinned `torchaudio==2.5.1` in `requirements.txt` BEFORE the `qwen-tts` line. torchaudio 2.5.1 (Nov 2024) was compiled against CUDA 12, and the pin prevents pip from upgrading to a CUDA 13 build. `kaldi.fbank()` (the only torchaudio function qwen-tts calls from this path) is a CPU-only DSP operation, so no GPU is needed.
**Result:** PASS ✓. Space RUNNING with new SHA 990b408, `/health_hf` → 200. Benchmark fix deployed.
---
## Iteration 11 - 2026-04-12
**Stage:** RUNNING (benchmark redesign, not a build error)
**Change:** Replaced the static `np.zeros` reference + arbitrary test text with a live microphone enrollment simulation. New UI: the user records one of the 3 frontend enrollment phrases via a Gradio `Audio` input; the benchmark clones their voice, synthesizes an AI response ("Great job! Now let's keep the conversation going. How was your day?"), and returns the RTF result + playable audio output. This mirrors the actual frontend UX: enroll → clone → hear AI response.
**Files changed:** `app.py` only.
**Result:** FAIL. Space RUNNING, `/health_hf` 200, but the Gradio API returns 500 Internal Server Error. The UI loads but the "Start →" button fails. See Iteration 12.
---
## Iteration 12 - 2026-04-13
**Stage:** RUNNING but Gradio API `/gradio_api/info` returns 500 Internal Server Error
**Error:** `File "/usr/local/lib/python3.10/site-packages/gradio_client/utils.py", line 967, in _json_schema_to_python_type` (crash during API schema generation)
**Root cause (via runtime logs):** Gradio generates a JSON schema for all function signatures when serving `/gradio_api/info`. The `gpu_chat_turn` function had type hints `ref: np.ndarray, history: list, turn_count: int, l1: str, l2: str`. The working theory was that `gradio_client`'s `_json_schema_to_python_type` cannot serialize `numpy.ndarray` into a JSON schema and crashes in the list comprehension at lines 967-968 while building property descriptions. The crash propagates through Starlette's middleware stack, resulting in a 500 on every request (including the frontend's queue/event polling calls).
**Fix applied:**
- Removed all type hints from `gpu_enroll_and_greet` and `gpu_chat_turn` signatures (Gradio's schema generator only inspects annotated parameters)
- Changed `gpu_enroll_and_greet` to return `ref.tolist()` (plain Python list) instead of `np.ndarray`, which keeps State JSON-serializable
- Changed `gpu_chat_turn` to accept `ref_list` (plain list) and convert it to `np.ndarray` internally via `np.array(ref_list, dtype=np.float32)` before passing it to `synthesize()`
- No changes to callbacks: `on_enroll` stores whatever the function returns; `on_send` passes it through unchanged
**Files changed:** `app.py` only.
**Result:** FAIL. The same crash persists. Removing the `np.ndarray` type hints did not resolve it. The root cause was actually the `gr.State(dict)` itself, not the function signature. See Iteration 13.
---
## Iteration 13 - 2026-04-13
**Stage:** RUNNING but `/gradio_api/info` still returns 500
**Error:** `TypeError: argument of type 'bool' is not iterable` at `gradio_client/utils.py:882 → get_type → if "const" in schema`
**Root cause:** Removing the `np.ndarray` type hints in Iteration 12 did not fix the crash. The actual source is `gr.State({"l1": "en", "l2": "es", "ref": None, "history": [], "turn_count": 0})`. When Gradio generates the API schema for this State, it calls `_json_schema_to_python_type` on the dict schema. The dict's JSON Schema representation has `additionalProperties: True` (a Python bool; JSON Schema allows a boolean here). The schema generator then evaluates `if "const" in schema` with `schema` bound to that Python bool `True`, which raises `TypeError: argument of type 'bool' is not iterable`. This happens in `gradio_client/utils.py` at line 882 regardless of function type hints; it is triggered by the State type itself.
**Fix applied:** Replaced single `gr.State(dict)` with **5 flat, primitive `gr.State` objects**:
- `state_l1 = gr.State("en")`: string, safe
- `state_l2 = gr.State("es")`: string, safe
- `state_ref = gr.State([])`: empty list (no numpy), safe
- `state_history = gr.State([])`: list of dicts (plain JSON), safe
- `state_turn_count = gr.State(0)`: int, safe
All callbacks updated to accept/return these flat states. `ref_list` (a Python list) is passed as `state_ref` and converted to `np.ndarray` inside `gpu_chat_turn` only. Full `app.py` rewrite.
**Files changed:** `app.py` only.
**Result:** PASS ✓. Space RUNNING, Gradio UI fully functional (language select → enrollment → chat → wall at turn 7), `/health_hf` → 200. See session 2 (Iterations 14-16) for the subsequent full-backend migration.
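
The `TypeError` from this iteration's root cause can be reproduced in isolation. The walker below is a deliberately simplified model, not the code in `gradio_client/utils.py`: `describe` and its `tolerate_bool` flag are hypothetical and only mimic the `"const" in schema` membership test applied to a boolean sub-schema.

```python
def describe(schema, tolerate_bool=False):
    """Toy schema walker that, by default, assumes every node is a dict."""
    if tolerate_bool and isinstance(schema, bool):
        return "Any"  # boolean schemas are valid JSON Schema
    if "const" in schema:  # raises TypeError when schema is True or False
        return repr(schema["const"])
    if schema.get("type") == "object" and "additionalProperties" in schema:
        inner = describe(schema["additionalProperties"], tolerate_bool)
        return f"dict(str, {inner})"
    return schema.get("type", "Any")


# Shape reported in the log for a dict-valued State.
dict_state_schema = {"type": "object", "additionalProperties": True}

try:
    describe(dict_state_schema)
except TypeError as exc:
    print("TypeError:", exc)

# Primitive schemas never reach the boolean branch.
print(describe({"type": "string"}))

# A walker that accepts boolean schemas handles the dict case as well.
print(describe(dict_state_schema, tolerate_bool=True))
```

Flat primitive states avoid the problem from the application side, which is the route taken here since the walker lives in a dependency.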
---
## Iteration 14 - 2026-04-13 (session 2)
**Stage:** Full backend migration attempt using the `gr.mount_gradio_app()` approach
**Goal:** Serve FastAPI REST API (all `/session/*` endpoints) alongside Gradio UI so the Vercel React frontend can talk directly to the HF Space instead of Railway.
**Approach:** Replaced `demo.launch()` with `app = gr.mount_gradio_app(api, demo, path="/ui")` where `api` is a standalone `FastAPI()` instance with all endpoints defined as routes.
**Error:** `RUNTIME_ERROR`. Space exits with code 0 (clean exit).
**Root cause:** HF Spaces with `sdk: gradio` require `demo.launch()` to start and block the server. `gr.mount_gradio_app()` returns an ASGI app object but does not start a server. This is the same failure as Iteration 4: the process runs to completion and exits immediately.
**Fix applied:** See Iteration 15.
---
## Iteration 15 - 2026-04-13 (session 2)
**Stage:** RUNTIME_ERROR: Space exits with code 0 after `gr.mount_gradio_app()`
**Approach:** Switched to `include_router()` pattern: patched `gradio.routes.App.create_app` to call `gapp.include_router(_vmr)` (adding all API routes to Gradio's internal FastAPI app), then ended with `demo.launch()` to keep the process alive.
**Error:** `GET /health` → `HTTP 308 Permanent Redirect` (location: `/`). All API routes return 308.
**Root cause:** Gradio 5.x registers a catch-all SPA route `/{path_name:path}` during `create_app`. FastAPI matches routes in insertion order, and the catch-all is registered first (inside Gradio's own `create_app` logic), so any routes added afterward via `include_router()` are never matched. Every unknown path gets 308-redirected to `/` before our routes are evaluated.
**Key lesson:** `include_router()` appends routes AFTER the catch-all, so they will never be reached in Gradio 5.x.
**Fix applied:** See Iteration 16.
---
## Iteration 16 - 2026-04-13 (session 2)
**Stage:** RUNNING but all API routes return 308 via `include_router()`
**Root cause:** Same Gradio 5.x SPA catch-all issue as Iteration 9 but for custom routes instead of `/health_hf`. `include_router()` is append-only and cannot insert before the catch-all.
**Fix applied:** Implemented all REST API endpoints as a single `BaseHTTPMiddleware` subclass (`_VocalMirrorAPI`) with regex-based path dispatch. Middleware runs BEFORE any route matching (the same pattern that fixed `/health_hf` in Iteration 9). `demo.launch()` stays bare. Session state lives in an in-memory dict, audio files go in `/tmp/`, enrollment runs in a background thread, and `gpu_tts()` is called from async context via the event loop's `run_in_executor()`.
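
Regex-based path dispatch of this kind might look like the sketch below. It is an illustration only: the paths are the ones verified in this iteration, but the `ROUTES` table, the handler names, and `dispatch` are hypothetical, and the real `_VocalMirrorAPI` may be organised differently.

```python
import re

# (method, compiled path pattern, handler name); handler names are hypothetical.
ROUTES = [
    ("GET", re.compile(r"^/health$"), "health"),
    ("GET", re.compile(r"^/vm-config$"), "vm_config"),
    ("POST", re.compile(r"^/session/start$"), "session_start"),
    ("GET", re.compile(r"^/session/(?P<session_id>[^/]+)/wall_status$"), "wall_status"),
]


def dispatch(method, path):
    """Return (handler_name, path_params), or None to fall through to Gradio."""
    for route_method, pattern, handler in ROUTES:
        match = pattern.match(path)
        if match and method == route_method:
            return handler, match.groupdict()
    return None


print(dispatch("GET", "/health"))
print(dispatch("GET", "/session/abc123/wall_status"))
print(dispatch("GET", "/"))  # None: not an API path, so the request is forwarded
```

Returning `None` for unmatched paths is what keeps the Gradio UI at `/` reachable: the middleware only answers requests it recognises.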
**Result:** PASS ✓
- `GET /health` → `{"status":"ok"}` HTTP 200 ✓
- `GET /vm-config` → `{"wall_turn_count":7}` ✓ (named `/vm-config` to avoid shadowing Gradio's own `/config`)
- `POST /session/start` → returns session_id + word_list ✓
- `GET /session/{id}/wall_status` → `{"show_wall":false,"turn_count":0}` ✓
- Gradio product UI at `/` still fully functional ✓
Space is RUNNING on zero-a10g with SHA ef1ba6a.
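
Calling a blocking function such as `gpu_tts()` from async middleware relies on the event loop's `run_in_executor()`. The sketch below shows the pattern with a stub: `gpu_tts_stub` and `handle_turn` are hypothetical, and the stub sleeps instead of synthesizing audio.

```python
import asyncio
import threading
import time


def gpu_tts_stub(text):
    """Hypothetical blocking stand-in for gpu_tts(); sleeps instead of synthesizing."""
    time.sleep(0.05)
    return f"audio-for:{text}", threading.current_thread().name


async def handle_turn(text):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor, so the
    # blocking call runs on a worker thread and the event loop stays responsive.
    return await loop.run_in_executor(None, gpu_tts_stub, text)


audio, worker = asyncio.run(handle_turn("hola"))
print(audio)
print("ran off the main thread:", worker != threading.current_thread().name)
```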