# HF Space Build Error Log - rubentuesday/vocal-mirror

This file is committed alongside every fix so the repo retains full context of what broke and why.

---

## Iteration 1 - 2026-04-11
**Stage:** CONFIG_ERROR  
**Error:** `No candidate PyTorch version found for ZeroGPU`  
**Root cause:** `requirements.txt` pinned `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` with `--extra-index-url https://download.pytorch.org/whl/cu121`. ZeroGPU manages its own CUDA PyTorch installation and rejects Spaces that pin a `+cu121`-suffixed variant. The build fails at config parse time, before any package is installed.  
**Fix applied:**  
- Removed `--extra-index-url https://download.pytorch.org/whl/cu121` from `requirements.txt`  
- Removed `torch==2.5.1+cu121` and `torchaudio==2.5.1+cu121` from `requirements.txt` (ZeroGPU provides these)  
- Changed `gradio>=5.0.0,<6.0` → `gradio==4.44.1` in `requirements.txt` (project rule: pin to 4.44.1)  
- Changed `sdk_version: 5.0.0` → `sdk_version: 4.44.1` in `README.md` YAML frontmatter  

**Result:** FAIL. The CONFIG_ERROR was resolved, but the change caused a new RUNTIME_ERROR (see Iteration 2). Pinning Gradio 4.44.1 was the wrong choice and was reverted.

---

## Iteration 2 - 2026-04-11
**Stage:** RUNTIME_ERROR  
**Error 1 (first):** `TypeError: unhashable type: 'dict'` in `jinja2/utils.py` (Gradio 4.x Jinja template cache bug)  
**Error 2:** `ValueError: When localhost is not accessible, a shareable link must be created. Please set share=True` (Gradio 4.x requires `share=True` on remote hosts)  
**Root cause:** Downgrading to `gradio==4.44.1` reintroduced two known Gradio 4.x bugs. The commit history already shows that `c0a2ea8` explicitly upgraded to 5.x to fix the Jinja crash. Both errors are 4.x-only issues fixed in 5.x. The "pin to 4.44.1" instruction in the task brief was outdated.  
**Fix applied:**  
- Reverted `requirements.txt`: `gradio==4.44.1` → `gradio>=5.0.0,<6.0`  
- Reverted `README.md`: `sdk_version: 4.44.1` → `sdk_version: 5.0.0`  

**Result:** PASS. The Space reached the RUNNING stage, but `/health_hf` returns 308 (route missing). Fixed in Iteration 3.

---

## Iteration 3 - 2026-04-11
**Stage:** RUNNING, but `/health_hf` returns 308 Permanent Redirect (no such route)  
**Root cause:** `app.py` only calls `demo.launch()` and defines no custom routes. Gradio 5.x redirects unknown paths to `/`.  
**Fix applied:** Switched from `demo.launch()` to the `gr.mount_gradio_app()` pattern:
- Added a `FastAPI` app with `@app.get("/health_hf")` returning `{"status": "ok"}`
- Replaced `demo.launch()` with `app = gr.mount_gradio_app(app, demo, path="/")`
- The `@spaces.GPU` decorator still handles ZeroGPU GPU allocation independently  

**Result:** FAIL. RUNTIME_ERROR with exit code 0. `gr.mount_gradio_app()` returns immediately and nothing blocks the process. Fixed in Iteration 4.

---

## Iteration 4 - 2026-04-11
**Stage:** RUNTIME_ERROR, `Exit code: 0. Reason: ` (clean exit; the process did not stay alive)  
**Root cause:** `gr.mount_gradio_app()` returns an ASGI app object but doesn't start a server. Without `demo.launch()` blocking, `app.py` runs to completion and exits.  
**Fix applied:** Added `uvicorn.run(app, host="0.0.0.0", port=7860)` after the mount call to start the ASGI server and block the process.  
**Result:** FAIL. RUNTIME_ERROR "No @spaces.GPU function detected during startup". `uvicorn.run()` bypasses the `spaces.zero.gradio` launch wrapper that scans for GPU functions. ZeroGPU requires `demo.launch()`. Fixed in Iteration 5.

---

## Iteration 5 - 2026-04-11
**Stage:** RUNTIME_ERROR, `No @spaces.GPU function detected during startup`  
**Root cause:** `gr.mount_gradio_app()` + `uvicorn.run()` bypasses the `spaces.zero.gradio` interceptor of `demo.launch()`. ZeroGPU scans for `@spaces.GPU`-decorated functions inside that interceptor. Because the interceptor is never called, the GPU functions are never registered.  
**Fix applied:** Reverted to bare `demo.launch()`. Added `/health_hf` by monkey-patching `gradio.routes.App.create_app` to inject the route into the Gradio FastAPI app at creation time, before ZeroGPU starts the server.  
**Result:** FAIL. "Application unable to start for an unknown reason". The `create_app.__func__` access likely failed (AttributeError or TypeError) in Gradio 5.x, crashing startup silently. Fixed in Iteration 6.

---

## Iteration 6 - 2026-04-11
**Stage:** RUNTIME_ERROR, "Application unable to start for an unknown reason"  
**Root cause:** Monkey-patching `gradio.routes.App.create_app.__func__` crashed at import/startup time in Gradio 5.x. The `__func__` access pattern assumes `create_app` is a classmethod. If the signature or descriptor changed in 5.x, the access raises `AttributeError` and kills the process before any server starts.  
**Fix applied:** Replaced the monkey-patch with a daemon thread that polls `demo.server` (set by Gradio after `demo.launch()` initializes the server) and injects `/health_hf` once it is available. `demo.launch()` stays bare, so ZeroGPU detection works normally. The thread is a no-op if injection fails.  
**Result:** FAIL. The Space is RUNNING but `/health_hf` still returns 308. `demo.server` is never set in the polling thread's context (ZeroGPU runs the real server in a GPU worker, not in the same process). Fixed in Iteration 7.

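The `__func__` failure mode can be reproduced without Gradio. The sketch below uses a hypothetical `App` class (not Gradio's real one) to show that `__func__` exists only when the attribute resolves to a bound method, such as a classmethod looked up on its class, and that `getattr(..., "__func__", ...)` is one defensive way to unwrap either form. Whether Gradio 5.x's `create_app` is actually a static method is an assumption; the log only records that the access likely failed.

```python
class App:
    """Hypothetical stand-in for a class exposing a create_app-style factory."""

    @classmethod
    def as_classmethod(cls):
        return "created via classmethod"

    @staticmethod
    def as_staticmethod():
        return "created via staticmethod"


def unwrap(attr):
    """Return the underlying function whether or not attr is a bound method."""
    return getattr(attr, "__func__", attr)


# A classmethod looked up on the class is a bound method, so __func__ exists.
has_func_cm = hasattr(App.as_classmethod, "__func__")

# A staticmethod looked up on the class is a plain function with no __func__,
# so App.as_staticmethod.__func__ would raise AttributeError.
has_func_sm = hasattr(App.as_staticmethod, "__func__")

print(has_func_cm, has_func_sm)
print(unwrap(App.as_classmethod)(App))   # raw function needs cls passed in
print(unwrap(App.as_staticmethod)())
```
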
---

## Iteration 7 - 2026-04-11
**Stage:** RUNNING but `/health_hf` still returns 308  
**Root cause:** In ZeroGPU, the actual Gradio server runs in a separate GPU worker process. `demo.server` is never set in the main process, so the daemon thread's poll always fails and the route is never injected.  
**Fix applied:** Use `demo.launch(prevent_thread_lock=True)`. The `spaces` interceptor still detects `@spaces.GPU` functions, then starts the server in a background thread in the same process and returns. After `launch()` returns, `demo.server.app` is accessible and we add `/health_hf`. The main thread is blocked via `threading.Event().wait()` (which avoids relying on `demo.block_thread()` existing in Gradio 5.x).  
**Result:** FAIL. `AttributeError: 'Server' object has no attribute 'app'`. Gradio 5.x's `Server` wraps uvicorn, and the FastAPI app lives at `server.config.app`, not `server.app`. Fixed in Iteration 8.

---

## Iteration 8 - 2026-04-11
**Stage:** RUNTIME_ERROR, `AttributeError: 'Server' object has no attribute 'app'`  
**Root cause:** `demo.server` is a Gradio `Server` (wrapping uvicorn). In uvicorn, the ASGI app is stored in `server.config.app` (the `Config` object passed at construction), not directly on `server.app`.  
**Fix applied:** Changed `demo.server.app.get(...)` → `demo.server.config.app.get(...)`.  
**Result:** FAIL. The Space is RUNNING but `/health_hf` still returns 308. `demo.server.config.app.get()` adds the route AFTER Gradio's catch-all `/{path_name:path}` is already registered. FastAPI matches routes in insertion order, so the catch-all, added first, wins. Fixed in Iteration 9.

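The insertion-order behaviour behind this result can be illustrated with a toy first-match router. This is a simplified sketch, not FastAPI or Starlette code: `FirstMatchRouter` and the handler names are hypothetical, and the regex `/.*` stands in for Gradio's `/{path_name:path}` catch-all.

```python
import re


class FirstMatchRouter:
    """Toy router: routes are tried in registration order, first match wins."""

    def __init__(self):
        self.routes = []  # list of (compiled pattern, handler name)

    def add(self, pattern, name):
        self.routes.append((re.compile(pattern), name))

    def resolve(self, path):
        for pattern, name in self.routes:
            if pattern.fullmatch(path):
                return name
        return None


# Catch-all registered first, specific route appended afterwards.
late_router = FirstMatchRouter()
late_router.add(r"/.*", "catch_all")
late_router.add(r"/health_hf", "health")
late = late_router.resolve("/health_hf")

# Same two routes, but the specific one is registered first.
early_router = FirstMatchRouter()
early_router.add(r"/health_hf", "health")
early_router.add(r"/.*", "catch_all")
early = early_router.resolve("/health_hf")

print(late, early)
```
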
---

## Iteration 9 - 2026-04-11
**Stage:** RUNNING but `/health_hf` returns 308  
**Root cause:** Adding `@app.get("/health_hf")` after `create_app` appends the route AFTER Gradio's catch-all `/{path_name:path}`. FastAPI/Starlette matches routes in registration order, so the catch-all, registered first, intercepts everything, including `/health_hf`.  
**Fix applied:** Use Starlette middleware (`BaseHTTPMiddleware`) patched into Gradio's `create_app`. Middleware runs BEFORE any route matching, so `/health_hf` is intercepted before the catch-all. Reverted to bare `demo.launch()` (ZeroGPU works). The entire patch is wrapped in `try/except` so failures are silent and don't prevent startup.  
**Result:** PASS ✓. Space RUNNING, `GET /health_hf` → `{"status":"ok"}` HTTP 200. All done after 9 iterations.

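Why middleware wins over the catch-all can be shown with a dependency-free sketch. The fix in this log used Starlette's `BaseHTTPMiddleware` patched into `create_app`; the code below swaps in a plain ASGI wrapper to illustrate the same ordering idea, and it is not the code deployed in `app.py`. `HealthMiddleware`, `catch_all_app`, and `request` are hypothetical names, and `catch_all_app` only imitates an app that redirects every path to `/`.

```python
import asyncio
import json


class HealthMiddleware:
    """Pure-ASGI wrapper that answers /health_hf before the inner app runs."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope["path"] == "/health_hf":
            body = json.dumps({"status": "ok"}).encode()
            await send({
                "type": "http.response.start",
                "status": 200,
                "headers": [(b"content-type", b"application/json")],
            })
            await send({"type": "http.response.body", "body": body})
            return  # the inner app (and its router) is never consulted
        await self.app(scope, receive, send)


async def catch_all_app(scope, receive, send):
    """Stand-in for an app whose catch-all redirects every path to '/'."""
    await send({
        "type": "http.response.start",
        "status": 308,
        "headers": [(b"location", b"/")],
    })
    await send({"type": "http.response.body", "body": b""})


async def request(app, path):
    """Drive an ASGI app once and return (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent[0]["status"], sent[1]["body"]


app = HealthMiddleware(catch_all_app)
health = asyncio.run(request(app, "/health_hf"))
other = asyncio.run(request(app, "/anything"))
print(health, other)
```
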
---

## Iteration 10 - 2026-04-12
**Stage:** RUNNING but "Run Benchmark" throws OSError  
**Error:** `OSError: Could not load this library: /usr/local/lib/python3.10/site-packages/torchaudio/lib/_torchaudio.abi3.so`  
**Root cause (via runtime logs):** `qwen-tts` depends on `torchaudio`. `pip install qwen-tts` upgraded `torchaudio` to the latest PyPI release, which was compiled against CUDA 13 (`libcudart.so.13`). ZeroGPU A10G runs CUDA 12, so `libcudart.so.13` is not present. Full import chain: `from qwen_tts import Qwen3TTSModel` → `speech_vq.py` → `import torchaudio.compliance.kaldi` → `torchaudio/__init__.py` → `torchaudio._extension` → `torch.ops.load_library("_torchaudio.abi3.so")` → `OSError: libcudart.so.13`.  
**Fix applied:** Pinned `torchaudio==2.5.1` in `requirements.txt` BEFORE the `qwen-tts` line. `torchaudio` 2.5.1 (Nov 2024) was compiled against CUDA 12, and the pin prevents pip from upgrading to a CUDA 13 build. `kaldi.fbank()` (the only `torchaudio` function `qwen-tts` calls from this path) is a CPU-only DSP operation, so no GPU is needed.  
**Result:** PASS ✓. Space RUNNING with new SHA 990b408, `/health_hf` → 200. Benchmark fix deployed.

---

## Iteration 11 - 2026-04-12
**Stage:** RUNNING, benchmark redesign (not a build error)  
**Change:** Replaced the static `np.zeros` reference and arbitrary test text with a live microphone enrollment simulation. New UI: the user records one of the 3 frontend enrollment phrases via a Gradio `Audio` input; the benchmark clones their voice, synthesizes an AI response ("Great job! Now let's keep the conversation going. How was your day?"), and returns the RTF result plus playable audio output. This mirrors the actual frontend UX: enroll → clone → hear AI response.  
**Files changed:** `app.py` only.  
**Result:** FAIL. The Space is RUNNING and `/health_hf` returns 200, but the Gradio API returns 500 Internal Server Error. The UI loads but the "Start →" button fails. See Iteration 12.

---

## Iteration 12 - 2026-04-13
**Stage:** RUNNING but Gradio API `/gradio_api/info` returns 500 Internal Server Error  
**Error:** `File "/usr/local/lib/python3.10/site-packages/gradio_client/utils.py", line 967, in _json_schema_to_python_type` (crash during API schema generation)  
**Root cause (via runtime logs):** Gradio generates a JSON schema for all function signatures when serving `/gradio_api/info`. The `gpu_chat_turn` function had type hints `ref: np.ndarray, history: list, turn_count: int, l1: str, l2: str`. `_json_schema_to_python_type`, the helper behind `gradio_client`'s `json_schema_to_python_type`, cannot serialize `numpy.ndarray` into a JSON schema. It crashes in the list comprehension at lines 967-968 while building property descriptions. The crash propagates through Starlette's middleware stack, resulting in a 500 on every request (including the frontend's queue/event polling calls).  
**Fix applied:**  
- Removed all type hints from the `gpu_enroll_and_greet` and `gpu_chat_turn` signatures (Gradio's schema generator only inspects annotated parameters)  
- Changed `gpu_enroll_and_greet` to return `ref.tolist()` (a plain Python list) instead of `np.ndarray`, which keeps the State JSON-serializable  
- Changed `gpu_chat_turn` to accept `ref_list` (a plain list) and convert it to `np.ndarray` internally via `np.array(ref_list, dtype=np.float32)` before passing it to `synthesize()`  
- No changes to callbacks: `on_enroll` stores whatever the function returns; `on_send` passes it through unchanged  

**Files changed:** `app.py` only.  
**Result:** FAIL. The same crash persists; removing the `np.ndarray` type hints did not resolve it. The root cause was actually the `gr.State(dict)` itself, not the function signature. See Iteration 13.

---

## Iteration 13 - 2026-04-13
**Stage:** RUNNING but `/gradio_api/info` still returns 500  
**Error:** `TypeError: argument of type 'bool' is not iterable` at `gradio_client/utils.py:882 → get_type → if "const" in schema`  
**Root cause:** Removing the `np.ndarray` type hints in Iteration 12 did not fix the crash. The actual source is `gr.State({"l1": "en", "l2": "es", "ref": None, "history": [], "turn_count": 0})`. When Gradio generates the API schema for this State, it calls `_json_schema_to_python_type` on the dict schema. The dict's JSON Schema representation has `additionalProperties: True` (a Python bool, per the JSON Schema spec). The schema generator then evaluates `if "const" in schema` where `schema` is the Python bool `True`, causing `TypeError: argument of type 'bool' is not iterable`. This happens in `gradio_client/utils.py` at line 882 regardless of function type hints, because it is triggered by the State type itself.  
**Fix applied:** Replaced the single `gr.State(dict)` with **5 flat, primitive `gr.State` objects**:
- `state_l1 = gr.State("en")`: string, safe
- `state_l2 = gr.State("es")`: string, safe
- `state_ref = gr.State([])`: empty list (no numpy), safe
- `state_history = gr.State([])`: list of dicts (plain JSON), safe
- `state_turn_count = gr.State(0)`: int, safe

All callbacks were updated to accept and return these flat states. `ref_list` (a Python list) is passed as `state_ref` and converted to `np.ndarray` inside `gpu_chat_turn` only. Full `app.py` rewrite.  
**Files changed:** `app.py` only.  
**Result:** PASS ✓. Space RUNNING, Gradio UI fully functional (language select → enrollment → chat → wall at turn 7), `/health_hf` → 200. See the 2026-04-13 session 2 iterations for the subsequent full-backend migration.

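The `TypeError` in this iteration comes down to a membership test on a boolean. The sketch below reproduces that pattern in isolation; `describe` is a simplified, hypothetical stand-in for the `get_type` helper named in the traceback, not the actual `gradio_client` source, and `describe_safely` shows one possible guard, offered only as an illustration.

```python
def describe(schema):
    """Mimics the failing pattern: a membership test on a schema node."""
    if "const" in schema:  # raises TypeError when schema is True or False
        return "const"
    return schema.get("type", "any")


def describe_safely(schema):
    """Same lookup, but handles boolean schemas before the membership test."""
    if isinstance(schema, bool):
        # In JSON Schema, True accepts any value and False accepts none.
        return "any" if schema else "never"
    return describe(schema)


# A dict-valued State yields an object schema whose additionalProperties
# entry is the boolean True rather than a nested schema dict.
state_schema = {"type": "object", "additionalProperties": True}

try:
    describe(state_schema["additionalProperties"])
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
print(describe_safely(state_schema["additionalProperties"]))
print(describe_safely(state_schema))
```
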
---

## Iteration 14 - 2026-04-13 (session 2)
**Stage:** Full backend migration attempt, `gr.mount_gradio_app()` approach  
**Goal:** Serve the FastAPI REST API (all `/session/*` endpoints) alongside the Gradio UI so the Vercel React frontend can talk directly to the HF Space instead of Railway.  
**Approach:** Replaced `demo.launch()` with `app = gr.mount_gradio_app(api, demo, path="/ui")` where `api` is a standalone `FastAPI()` instance with all endpoints defined as routes.  
**Error:** `RUNTIME_ERROR`: the Space exits with code 0 (clean exit).  
**Root cause:** HF Spaces with `sdk: gradio` require `demo.launch()` to start and block the server. `gr.mount_gradio_app()` returns an ASGI app object but does not start a server. This is the same failure as Iteration 4: the process runs to completion and exits immediately.  
**Fix applied:** See Iteration 15.

---

## Iteration 15 - 2026-04-13 (session 2)
**Stage:** RUNTIME_ERROR, Space exits with code 0 after `gr.mount_gradio_app()`  
**Approach:** Switched to the `include_router()` pattern: patched `gradio.routes.App.create_app` to call `gapp.include_router(_vmr)` (adding all API routes to Gradio's internal FastAPI app), then ended with `demo.launch()` to keep the process alive.  
**Error:** `GET /health` → `HTTP 308 Permanent Redirect` (location: `/`). All API routes return 308.  
**Root cause:** Gradio 5.x registers a catch-all SPA route `/{path_name:path}` during `create_app`. FastAPI matches routes in insertion order, and the catch-all is registered first (inside Gradio's own `create_app` logic), so any routes added afterward via `include_router()` are never matched. Every unknown path gets 308-redirected to `/` before our routes are evaluated.  
**Key lesson:** `include_router()` appends routes AFTER the catch-all, so they will never be reached in Gradio 5.x.  
**Fix applied:** See Iteration 16.

---

## Iteration 16 - 2026-04-13 (session 2)
**Stage:** RUNNING but all API routes return 308 via `include_router()`  
**Root cause:** Same Gradio 5.x SPA catch-all issue as Iteration 9, but for custom routes instead of `/health_hf`. `include_router()` is append-only and cannot insert before the catch-all.  
**Fix applied:** Implemented all REST API endpoints as a single `BaseHTTPMiddleware` subclass (`_VocalMirrorAPI`) with regex-based path dispatch. Middleware runs BEFORE any route matching (the same pattern that fixed `/health_hf` in Iteration 9). `demo.launch()` stays bare. Session state lives in an in-memory dict and audio in `/tmp/`; enrollment runs in a background thread, and `gpu_tts()` is called from the async context via the event loop's `run_in_executor()`.  
**Result:** PASS ✓  
- `GET /health` → `{"status":"ok"}` HTTP 200 ✓  
- `GET /vm-config` → `{"wall_turn_count":7}` ✓ (named `/vm-config` to avoid shadowing Gradio's own `/config`)  
- `POST /session/start` → returns session_id + word_list ✓  
- `GET /session/{id}/wall_status` → `{"show_wall":false,"turn_count":0}` ✓  
- Gradio product UI at `/` still fully functional ✓  

Space is RUNNING on zero-a10g with SHA ef1ba6a.
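
The regex-based dispatch and the executor hand-off described in this iteration can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the `_VocalMirrorAPI` code: `slow_tts` is a hypothetical blocking function standing in for `gpu_tts()`, the `/session/{id}/speak` path is invented for the example, and a real middleware would call `dispatch` first and fall through to the wrapped app when it returns `None`.

```python
import asyncio
import re


def slow_tts(text):
    """Hypothetical blocking call standing in for GPU synthesis."""
    return f"audio({text})"


async def health(match, body):
    return 200, {"status": "ok"}


async def wall_status(match, body):
    return 200, {"session": match.group("sid"), "show_wall": False}


async def speak(match, body):
    # Run the blocking function in the default thread pool so the
    # event loop stays responsive while it works.
    loop = asyncio.get_running_loop()
    audio = await loop.run_in_executor(None, slow_tts, body)
    return 200, {"audio": audio}


# (method, compiled path pattern, handler), checked in order.
ROUTES = [
    ("GET", re.compile(r"/health"), health),
    ("GET", re.compile(r"/session/(?P<sid>[^/]+)/wall_status"), wall_status),
    ("POST", re.compile(r"/session/(?P<sid>[^/]+)/speak"), speak),  # hypothetical path
]


async def dispatch(method, path, body=None):
    """Return (status, payload), or None to fall through to the wrapped app."""
    for route_method, pattern, handler in ROUTES:
        match = pattern.fullmatch(path)
        if match and method == route_method:
            return await handler(match, body)
    return None


print(asyncio.run(dispatch("GET", "/health")))
print(asyncio.run(dispatch("GET", "/session/abc/wall_status")))
print(asyncio.run(dispatch("POST", "/session/abc/speak", "hi")))
print(asyncio.run(dispatch("GET", "/")))
```
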