vocal-mirror / build-errors /build_errors.md

rubentuesday

docs: log iterations 14-16 (full backend migration, middleware fix)

71b7030 about 2 months ago


	# HF Space Build Error Log — rubentuesday/vocal-mirror

	This file is committed alongside every fix so the repo retains full context of what broke and why.

	---

	## Iteration 1 — 2026-04-11
	Stage: CONFIG_ERROR
	Error: `No candidate PyTorch version found for ZeroGPU`
	Root cause: `requirements.txt` pinned `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` with `--extra-index-url https://download.pytorch.org/whl/cu121`. ZeroGPU manages its own CUDA PyTorch installation and rejects spaces that pin a `+cu121`-suffixed variant — it fails at config parse time before any package install.
	Fix applied:
	- Removed `--extra-index-url https://download.pytorch.org/whl/cu121` from `requirements.txt`
	- Removed `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` from `requirements.txt` (ZeroGPU provides these)
	- Changed `gradio>=5.0.0,<6.0` → `gradio==4.44.1` in `requirements.txt` (project rule: pin to 4.44.1)
	- Changed `sdk_version: 5.0.0` → `sdk_version: 4.44.1` in `README.md` YAML frontmatter
	Result: FAIL — CONFIG_ERROR resolved, but the Gradio downgrade caused a new RUNTIME_ERROR (see Iteration 2). Pinning Gradio 4.44.1 was the wrong choice and was reverted in Iteration 2.
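
A pre-push check along these lines could flag the offending pins before a build is attempted. This is only a sketch and not part of the repo: the name `find_cuda_pins` and both regexes are assumptions, and it looks only for `+cuNNN` local-version suffixes and `download.pytorch.org` index lines, which are the two things this iteration removed.

```python
import re

# Matches pins such as torch==2.5.1+cu121 (a "+cuNNN" local version label).
CUDA_PIN = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*==\s*[^\s#]*\+cu\d+", re.IGNORECASE)
# Matches index lines that point at the PyTorch wheel index.
TORCH_INDEX = re.compile(
    r"^\s*--(?:extra-)?index-url\s+\S*download\.pytorch\.org", re.IGNORECASE
)


def find_cuda_pins(requirements_text):
    """Return (line_number, line) pairs that pin a CUDA-suffixed wheel or the PyTorch index."""
    hits = []
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        if CUDA_PIN.match(line) or TORCH_INDEX.match(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "--extra-index-url https://download.pytorch.org/whl/cu121\n"
        "torch==2.5.1+cu121\n"
        "torchaudio==2.5.1+cu121\n"
        "gradio>=5.0.0,<6.0\n"
    )
    for number, line in find_cuda_pins(sample):
        print(f"line {number}: {line}")
```

An empty result means none of the patterns from this iteration are present; it does not prove the file is otherwise acceptable to ZeroGPU.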

	---

	## Iteration 2 — 2026-04-11
	Stage: RUNTIME_ERROR
	Error 1 (first): `TypeError: unhashable type: 'dict'` in `jinja2/utils.py` — Gradio 4.x Jinja template cache bug
	Error 2: `ValueError: When localhost is not accessible, a shareable link must be created. Please set share=True` — Gradio 4.x requires share=True on remote hosts
	Root cause: Downgrading to `gradio==4.44.1` reintroduced two known Gradio 4.x bugs. The commit history already shows `c0a2ea8` explicitly upgraded to 5.x to fix the Jinja crash. Both errors are 4.x-only issues fixed in 5.x. The "pin to 4.44.1" instruction in the task brief was outdated.
	Fix applied:
	- Reverted `requirements.txt`: `gradio==4.44.1` → `gradio>=5.0.0,<6.0`
	- Reverted `README.md`: `sdk_version: 4.44.1` → `sdk_version: 5.0.0`
	Result: PASS — Space reached RUNNING stage. `/health_hf` returns 308 (route missing). Fixed in Iteration 3.

	---

	## Iteration 3 — 2026-04-11
	Stage: RUNNING but `/health_hf` returns 308 Permanent Redirect (no such route)
	Root cause: `app.py` only has `demo.launch()` with no custom routes. Gradio 5.x redirects unknown paths to `/`.
	Fix applied: Switched from `demo.launch()` to `gr.mount_gradio_app()` pattern:
	- Added `FastAPI` app with `@app.get("/health_hf")` returning `{"status": "ok"}`
	- Replaced `demo.launch()` with `app = gr.mount_gradio_app(app, demo, path="/")`
	- `@spaces.GPU` decorator still handles ZeroGPU GPU allocation independently
	Result: FAIL — RUNTIME_ERROR exit code 0. `gr.mount_gradio_app()` returns immediately; nothing blocks the process. Fixed in Iteration 4.

	---

	## Iteration 4 — 2026-04-11
	Stage: RUNTIME_ERROR — `Exit code: 0. Reason: ` (clean exit, process didn't stay alive)
	Root cause: `gr.mount_gradio_app()` returns an ASGI app object but doesn't start a server. Without `demo.launch()` blocking, `app.py` runs to completion and exits.
	Fix applied: Added `uvicorn.run(app, host="0.0.0.0", port=7860)` after the mount call to start the ASGI server and block the process.
	Result: FAIL — RUNTIME_ERROR `No @spaces.GPU function detected during startup`. `uvicorn.run()` bypasses the `spaces.zero.gradio` launch wrapper that scans for GPU functions. ZeroGPU requires `demo.launch()`. Fixed in Iteration 5.

	---

	## Iteration 5 — 2026-04-11
	Stage: RUNTIME_ERROR — `No @spaces.GPU function detected during startup`
	Root cause: `gr.mount_gradio_app()` + `uvicorn.run()` bypasses the `spaces.zero.gradio` interceptor of `demo.launch()`. ZeroGPU scans for `@spaces.GPU`-decorated functions inside that interceptor; because the interceptor is never called, the GPU functions are never registered.
	Fix applied: Reverted to bare `demo.launch()`. Added `/health_hf` by monkey-patching `gradio.routes.App.create_app` to inject the route into the Gradio FastAPI app at creation time, before ZeroGPU starts the server.
	Result: FAIL — "Application unable to start for an unknown reason". The `create_app.__func__` access likely failed (AttributeError or TypeError) in Gradio 5.x, crashing startup silently. Fixed in Iteration 6.

	---

	## Iteration 6 — 2026-04-11
	Stage: RUNTIME_ERROR — "Application unable to start for an unknown reason"
	Root cause: Monkey-patching `gradio.routes.App.create_app.__func__` crashed at import/startup time in Gradio 5.x. The `__func__` access pattern assumes `create_app` is a classmethod — if the signature or descriptor changed in 5.x, this raises AttributeError and kills the process before any server starts.
	Fix applied: Replaced monkey-patch with a daemon thread that polls `demo.server` (set by Gradio after `demo.launch()` initializes the server) and injects `/health_hf` once available. `demo.launch()` stays bare — ZeroGPU detection works normally. Thread is a no-op if injection fails.
	Result: FAIL — Space is RUNNING but `/health_hf` still returns 308. `demo.server` is never set in the polling thread's context (ZeroGPU runs the real server in a GPU worker, not the same process). Fixed in Iteration 7.
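
The `__func__` failure mode described in this iteration's root cause can be reproduced without Gradio. The sketch below uses a hypothetical `Factory` class; it does not claim to show how `create_app` is actually declared in Gradio 5.x, which the log itself only infers. It shows that `__func__` exists on a classmethod accessed through the class but not on a staticmethod, and a helper (`underlying_function`, also hypothetical) that avoids assuming either.

```python
class Factory:
    """Hypothetical stand-in for a class exposing a factory method."""

    @classmethod
    def as_classmethod(cls):
        return "made by classmethod"

    @staticmethod
    def as_staticmethod():
        return "made by staticmethod"


def underlying_function(owner, name):
    """Return the plain function behind owner.name without assuming its descriptor type."""
    attr = getattr(owner, name)
    # Bound methods (including classmethods accessed on the class) expose __func__;
    # staticmethods and plain functions accessed on the class do not.
    return getattr(attr, "__func__", attr)


if __name__ == "__main__":
    print(hasattr(Factory.as_classmethod, "__func__"))   # True
    print(hasattr(Factory.as_staticmethod, "__func__"))  # False
    print(underlying_function(Factory, "as_staticmethod")())
```

A patch that reads `SomeClass.method.__func__` unconditionally raises `AttributeError` in the second case, which is consistent with the silent startup crash recorded above.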

	---

	## Iteration 7 — 2026-04-11
	Stage: RUNNING but `/health_hf` still returns 308
	Root cause: In ZeroGPU, the actual Gradio server runs in a separate GPU worker process. `demo.server` is never set in the main process, so the daemon thread's poll always fails and the route is never injected.
	Fix applied: Use `demo.launch(prevent_thread_lock=True)` — the spaces interceptor still detects `@spaces.GPU` functions, then starts the server in a background thread in the same process and returns. After `launch()` returns, `demo.server.app` is accessible and we add `/health_hf`. Main thread blocked via `threading.Event().wait()` (avoids relying on `demo.block_thread()` existing in Gradio 5.x).
	Result: FAIL — `AttributeError: 'Server' object has no attribute 'app'`. Gradio 5.x's `Server` wraps uvicorn — the FastAPI app lives at `server.config.app`, not `server.app`. Fixed in Iteration 8.

	---

	## Iteration 8 — 2026-04-11
	Stage: RUNTIME_ERROR — `AttributeError: 'Server' object has no attribute 'app'`
	Root cause: `demo.server` is a Gradio `Server` (wrapping uvicorn). In uvicorn, the ASGI app is stored in `server.config.app` (the `Config` object passed at construction), not directly on `server.app`.
	Fix applied: Changed `demo.server.app.get(...)` → `demo.server.config.app.get(...)`.
	Result: FAIL — Space RUNNING but `/health_hf` still 308. `demo.server.config.app.get()` adds route AFTER Gradio's catch-all `/{path_name:path}` is already registered. FastAPI matches routes in insertion order — catch-all added first wins. Fixed in Iteration 9.

	---

	## Iteration 9 — 2026-04-11
	Stage: RUNNING but `/health_hf` returns 308
	Root cause: Adding `@app.get("/health_hf")` after `create_app` appends the route AFTER Gradio's catch-all `/{path_name:path}`. FastAPI/Starlette matches routes in registration order — the catch-all was registered first and intercepts everything, including `/health_hf`.
	Fix applied: Use Starlette middleware (`BaseHTTPMiddleware`) patched into Gradio's `create_app`. Middleware runs BEFORE any route matching, so `/health_hf` is intercepted before the catch-all. Reverted to bare `demo.launch()` (ZeroGPU works). Entire patch wrapped in `try/except` so failures are silent and don't prevent startup.
	Result: PASS ✓ — Space RUNNING, `GET /health_hf` → `{"status":"ok"}` HTTP 200. All done after 9 iterations.
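
The ordering problem from Iterations 8 and 9 can be illustrated with a toy first-match router. This is a simplified model, not Starlette's or Gradio's code: `FirstMatchRouter`, `compile_route` and `with_health_middleware` are hypothetical names, and the path templating supports only literal paths and `{name:path}`.

```python
import re


def compile_route(pattern):
    """Turn '/literal' or '/{name:path}' into a regex (literals are not escaped)."""
    regex = re.sub(r"\{(\w+):path\}", r"(?P<\1>.*)", pattern)
    return re.compile(f"^{regex}$")


class FirstMatchRouter:
    """Toy router: the first registered pattern that matches handles the request."""

    def __init__(self):
        self.routes = []

    def add(self, pattern, handler):
        self.routes.append((compile_route(pattern), handler))

    def dispatch(self, path):
        for regex, handler in self.routes:
            if regex.match(path):
                return handler(path)
        return "404"


def with_health_middleware(dispatch):
    """Wrap a dispatch function so /health_hf is answered before any routing."""
    def wrapped(path):
        if path == "/health_hf":
            return '200 {"status":"ok"}'
        return dispatch(path)
    return wrapped


if __name__ == "__main__":
    router = FirstMatchRouter()
    router.add("/{path_name:path}", lambda path: "308 -> /")       # catch-all first
    router.add("/health_hf", lambda path: '200 {"status":"ok"}')   # appended later
    print(router.dispatch("/health_hf"))                            # 308 -> /
    print(with_health_middleware(router.dispatch)("/health_hf"))    # 200 {"status":"ok"}
```

The appended route is never reached because the catch-all matches first, while the wrapper answers before the route list is consulted at all.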

	---

	## Iteration 10 — 2026-04-12
	Stage: RUNNING but "Run Benchmark" throws OSError
	Error: `OSError: Could not load this library: /usr/local/lib/python3.10/site-packages/torchaudio/lib/_torchaudio.abi3.so`
	Root cause (via runtime logs): `qwen-tts` depends on `torchaudio`. `pip install qwen-tts` upgraded `torchaudio` to the latest PyPI release which was compiled against CUDA 13 (`libcudart.so.13`). ZeroGPU A10G runs CUDA 12, so `libcudart.so.13` is not present. Full import chain: `from qwen_tts import Qwen3TTSModel` → `speech_vq.py` → `import torchaudio.compliance.kaldi` → `torchaudio/__init__.py` → `torchaudio._extension` → `torch.ops.load_library("_torchaudio.abi3.so")` → `OSError: libcudart.so.13`.
	Fix applied: Pinned `torchaudio==2.5.1` in `requirements.txt` BEFORE the `qwen-tts` line. torchaudio 2.5.1 (Nov 2024) was compiled against CUDA 12, and the pin prevents pip from upgrading to a CUDA 13 build. `kaldi.fbank()` (the only torchaudio function qwen-tts calls from this path) is a CPU-only DSP operation, so no GPU is needed.
	Result: PASS ✓ — Space RUNNING with new SHA 990b408, `/health_hf` → 200. Benchmark fix deployed.
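
To confirm which versions pip actually resolved, a startup check like the following could log them. It is a sketch under assumptions: the helper names are made up, the distribution names are taken from this log, and it reports version strings only, not which CUDA runtime a wheel was built against.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchaudio", "qwen-tts")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```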

	---

	## Iteration 11 — 2026-04-12
	Stage: RUNNING — benchmark redesign (not a build error)
	Change: Replaced static `np.zeros` reference + arbitrary test text with a live microphone enrollment simulation. New UI: user records one of the 3 frontend enrollment phrases via Gradio `Audio` input, benchmark clones their voice and synthesizes an AI response ("Great job! Now let's keep the conversation going. How was your day?"), returns RTF result + playable audio output. Mirrors the actual frontend UX: enroll → clone → hear AI response.
	Files changed: `app.py` only.
	Result: FAIL — space RUNNING, `/health_hf` 200, but Gradio API returns 500 Internal Server Error. UI loads but "Start →" button fails. See Iteration 12.

	---

	## Iteration 12 — 2026-04-13
	Stage: RUNNING but Gradio API `/gradio_api/info` returns 500 Internal Server Error
	Error: `File "/usr/local/lib/python3.10/site-packages/gradio_client/utils.py", line 967, in _json_schema_to_python_type` — crash during API schema generation
	Root cause (initial diagnosis via runtime logs; superseded in Iteration 13): Gradio generates a JSON schema for all function signatures when serving `/gradio_api/info`. The `gpu_chat_turn` function had type hints `ref: np.ndarray, history: list, turn_count: int, l1: str, l2: str`. The working theory was that `_json_schema_to_python_type` in `gradio_client/utils.py` cannot convert `numpy.ndarray` into a JSON schema and crashes in the list comprehension at lines 967–968 while building property descriptions. The crash propagates through Starlette's middleware stack, resulting in a 500 on every request (including the frontend's queue/event polling calls).
	Fix applied:
	- Removed all type hints from `gpu_enroll_and_greet` and `gpu_chat_turn` signatures — Gradio's schema generator only inspects annotated parameters
	- Changed `gpu_enroll_and_greet` to return `ref.tolist()` (plain Python list) instead of `np.ndarray` — keeps State JSON-serializable
	- Changed `gpu_chat_turn` to accept `ref_list` (plain list) and convert to `np.ndarray` internally via `np.array(ref_list, dtype=np.float32)` before passing to `synthesize()`
	- No changes to callbacks — `on_enroll` stores whatever the function returns; `on_send` passes it through unchanged
	Files changed: `app.py` only.
	Result: FAIL — the same crash persists. Removing the `np.ndarray` type hints did not resolve it. The root cause was actually `gr.State(dict)` itself, not the function signature. See Iteration 13.

	---

	## Iteration 13 — 2026-04-13
	Stage: RUNNING but `/gradio_api/info` still returns 500
	Error: `TypeError: argument of type 'bool' is not iterable` at `gradio_client/utils.py:882 → get_type → if "const" in schema`
	Root cause: Removing the `np.ndarray` type hints in Iteration 12 did not fix the crash. The actual source is `gr.State({"l1": "en", "l2": "es", "ref": None, "history": [], "turn_count": 0})`. When Gradio generates the API schema for this State, it calls `_json_schema_to_python_type` on the dict schema. That schema contains `additionalProperties: True`; JSON Schema allows a boolean there, and it arrives as a Python `bool`. The schema generator then evaluates `if "const" in schema` in `get_type` with `schema` already the Python bool `True`, which raises `TypeError: argument of type 'bool' is not iterable`. This happens in `gradio_client/utils.py` at line 882 regardless of function type hints, because it is triggered by the State value itself.
	Fix applied: Replaced single `gr.State(dict)` with 5 flat, primitive `gr.State` objects:
	- `state_l1 = gr.State("en")` — string, safe
	- `state_l2 = gr.State("es")` — string, safe
	- `state_ref = gr.State([])` — empty list (no numpy), safe
	- `state_history = gr.State([])` — list of dicts (plain JSON), safe
	- `state_turn_count = gr.State(0)` — int, safe
	All callbacks updated to accept/return these flat states. `ref_list` (a Python list) is passed as `state_ref` and converted to `np.ndarray` inside `gpu_chat_turn` only. Full `app.py` rewrite.
	Files changed: `app.py` only.
	Result: PASS ✓ — Space RUNNING, Gradio UI fully functional (language select → enrollment → chat → wall at turn 7), `/health_hf` → 200. See Iterations 14-16 (session 2, 2026-04-13) for the subsequent full-backend migration.
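
The `TypeError` in this iteration comes from applying a membership test to a boolean schema. The sketch below is not `gradio_client`'s code; `get_type_unguarded` and `get_type_guarded` are hypothetical stand-ins that reproduce the pattern quoted in the traceback and show the kind of guard that would avoid it.

```python
def get_type_unguarded(schema):
    """Mimics the failing pattern: assumes schema is always a dict."""
    if "const" in schema:  # raises TypeError when schema is True or False
        return "const"
    return schema.get("type", "any")


def get_type_guarded(schema):
    """Same lookup, but treats boolean schemas (valid in JSON Schema) explicitly."""
    if isinstance(schema, bool):
        return "any" if schema else "never"
    if "const" in schema:
        return "const"
    return schema.get("type", "any")


if __name__ == "__main__":
    dict_schema = {"type": "object", "additionalProperties": True}
    try:
        get_type_unguarded(dict_schema["additionalProperties"])
    except TypeError as exc:
        print(f"TypeError: {exc}")
    print(get_type_guarded(dict_schema["additionalProperties"]))  # any
```

Since the schema generator lives in a dependency, the fix recorded above sidesteps it by keeping every `gr.State` value flat and primitive instead of patching the library.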

	---

	## Iteration 14 — 2026-04-13 (session 2)
	Stage: Full backend migration attempt — `gr.mount_gradio_app()` approach
	Goal: Serve FastAPI REST API (all `/session/*` endpoints) alongside Gradio UI so the Vercel React frontend can talk directly to the HF Space instead of Railway.
	Approach: Replaced `demo.launch()` with `app = gr.mount_gradio_app(api, demo, path="/ui")` where `api` is a standalone `FastAPI()` instance with all endpoints defined as routes.
	Error: `RUNTIME_ERROR` — Space exits with code 0 (clean exit).
	Root cause: HF Spaces with `sdk: gradio` require `demo.launch()` to start and block the server. `gr.mount_gradio_app()` returns an ASGI app object but does not start a server — same as Iteration 4 (the process runs to completion and exits immediately).
	Fix applied: See Iteration 15.

	---

	## Iteration 15 — 2026-04-13 (session 2)
	Stage: RUNTIME_ERROR — Space exits code 0 after `gr.mount_gradio_app()`
	Approach: Switched to `include_router()` pattern: patched `gradio.routes.App.create_app` to call `gapp.include_router(_vmr)` (adding all API routes to Gradio's internal FastAPI app), then ended with `demo.launch()` to keep the process alive.
	Error: `GET /health` → `HTTP 308 Permanent Redirect` (location: `/`). All API routes return 308.
	Root cause: Gradio 5.x registers a catch-all SPA route `/{path_name:path}` during `create_app`. FastAPI matches routes in insertion order — the catch-all is registered first (inside Gradio's own `create_app` logic), so any routes added afterward via `include_router()` are never matched. Every unknown path gets 308-redirected to `/` before our routes are evaluated.
	Key lesson: `include_router()` appends routes AFTER the catch-all — they will never be reached in Gradio 5.x.
	Fix applied: See Iteration 16.

	---

	## Iteration 16 — 2026-04-13 (session 2)
	Stage: RUNNING but all API routes return 308 via `include_router()`
	Root cause: Same Gradio 5.x SPA catch-all issue as Iteration 9 but for custom routes instead of `/health_hf`. `include_router()` is append-only and cannot insert before the catch-all.
	Fix applied: Implemented all REST API endpoints as a single `BaseHTTPMiddleware` subclass (`_VocalMirrorAPI`) with regex-based path dispatch. Middleware runs BEFORE any route matching (same pattern that fixed `/health_hf` in Iteration 9). `demo.launch()` stays bare. Session state lives in an in-memory dict, audio is written to `/tmp/`, enrollment runs in a background thread, and `gpu_tts()` is called from the async context via asyncio's `loop.run_in_executor()`.
	Result: PASS ✓
	- `GET /health` → `{"status":"ok"}` HTTP 200 ✓
	- `GET /vm-config` → `{"wall_turn_count":7}` ✓ (named `/vm-config` to avoid shadowing Gradio's own `/config`)
	- `POST /session/start` → returns session_id + word_list ✓
	- `GET /session/{id}/wall_status` → `{"show_wall":false,"turn_count":0}` ✓
	- Gradio product UI at `/` still fully functional ✓
	Space is RUNNING on zero-a10g with SHA ef1ba6a.
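
The shape of the Iteration 16 fix can be sketched with the standard library alone. Note the substitution: the real `_VocalMirrorAPI` subclasses Starlette's `BaseHTTPMiddleware`, whereas this sketch is a plain ASGI callable so that it runs without third-party packages. All names here (`RegexAPIMiddleware`, `blocking_tts`, `fallback_app`, `call`, the `/tts/...` path) are hypothetical and only illustrate regex dispatch ahead of the wrapped app, plus offloading a blocking call with `loop.run_in_executor()`.

```python
import asyncio
import json
import re


def blocking_tts(text):
    """Hypothetical stand-in for a blocking call such as gpu_tts()."""
    return {"chars": len(text)}


class RegexAPIMiddleware:
    """Pure-ASGI middleware: answers matching paths itself, forwards the rest."""

    def __init__(self, app):
        self.app = app
        # (method, compiled path regex, handler) tuples, checked in order.
        self.routes = [
            ("GET", re.compile(r"^/health$"), self.health),
            ("GET", re.compile(r"^/tts/(?P<text>[^/]+)$"), self.tts),
        ]

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            for method, regex, handler in self.routes:
                match = regex.match(scope["path"])
                if match and scope["method"] == method:
                    payload = await handler(**match.groupdict())
                    await self._send_json(send, payload)
                    return
        # Not one of ours: let the wrapped app (for example the UI) handle it.
        await self.app(scope, receive, send)

    async def health(self):
        return {"status": "ok"}

    async def tts(self, text):
        # Run the blocking function in the default thread pool so the loop stays free.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, blocking_tts, text)

    @staticmethod
    async def _send_json(send, payload, status=200):
        data = json.dumps(payload).encode()
        headers = [(b"content-type", b"application/json"),
                   (b"content-length", str(len(data)).encode())]
        await send({"type": "http.response.start", "status": status, "headers": headers})
        await send({"type": "http.response.body", "body": data})


async def fallback_app(scope, receive, send):
    """Stand-in for the wrapped app; redirects everything to '/'."""
    await send({"type": "http.response.start", "status": 308,
                "headers": [(b"location", b"/")]})
    await send({"type": "http.response.body", "body": b""})


async def call(app, path, method="GET"):
    """Drive an ASGI app with one request and return (status, body bytes)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": method, "path": path, "headers": []}, receive, send)
    body = b"".join(m.get("body", b"") for m in sent if m["type"] == "http.response.body")
    return sent[0]["status"], body


if __name__ == "__main__":
    app = RegexAPIMiddleware(fallback_app)
    print(asyncio.run(call(app, "/health")))
    print(asyncio.run(call(app, "/anything-else")))
```

Because the middleware wraps the whole app, its paths are checked before the wrapped app's route table, which is why the catch-all can no longer shadow them.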
