# Failure Notes

## Purpose

This file tracks reproducible failures and fallback behavior for Objectverse Diary.
Use it for model/runtime/deployment/data issues, not for UI polish notes.

## Current Status

MiniCPM-V 2.6 is wired as an optional vision backend. Hosted Space ZeroGPU validation now passes for public mug, keyboard, and shoe images after adding an `HF_TOKEN` Space secret with access to the gated `openbmb/MiniCPM-V-2_6` model.

The app includes a hidden `/vision_runtime_probe` API, and `scripts/check_space_vlm.py` writes the probe output into the Space VLM report before image validation. This probe identified the previous failure as a gated-model access issue rather than a GPU or dependency issue.

The published LoRA v2 GGUF for local text smoke testing is available and has passed a local llama.cpp smoke test:

- repo: `qqyule/objectverse-diary-qwen15b-lora`
- file: `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`
- helper: `scripts/check_llama_cpp_smoke.py`

Known non-blocking warning:

- Gradio emits deprecation warnings for upcoming 6.0 API changes during local tests. This does not break the current Gradio Blocks build and can be handled in the later UI/API polish pass.

## Failure Record Template

```markdown
## YYYY-MM-DD - Short Failure Name
- Area:
- Reproduction:
- Expected:
- Actual:
- Impact:
- Fallback used:
- Resolution:
- Evidence:
```
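
Records that follow the template can also be rendered by a small helper so the field order stays consistent. This is a minimal sketch: `FailureRecord` and `render_failure_record` are hypothetical names and are not part of the repository.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FailureRecord:
    """Hypothetical container mirroring the template fields."""
    day: date
    name: str
    area: str
    reproduction: str
    expected: str
    actual: str
    impact: str
    fallback_used: str
    resolution: str
    evidence: str


def render_failure_record(record: FailureRecord) -> str:
    """Render one record as markdown, in the same field order as the template."""
    fields = [
        ("Area", record.area),
        ("Reproduction", record.reproduction),
        ("Expected", record.expected),
        ("Actual", record.actual),
        ("Impact", record.impact),
        ("Fallback used", record.fallback_used),
        ("Resolution", record.resolution),
        ("Evidence", record.evidence),
    ]
    lines = [f"## {record.day.isoformat()} - {record.name}"]
    lines += [f"- {label}: {value}" for label, value in fields]
    return "\n".join(lines) + "\n"
```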

## 2026-06-08 - Hosted ZeroGPU MiniCPM-V Falls Back To Mock

- Area: Hugging Face Space vision runtime.
- Reproduction: Run `scripts/check_space_vlm.py` with `--configure-space --hardware zero-a10g --rollback-to-mock` against `build-small-hackathon/ObjectverseDiary`.
- Expected: mug, keyboard, and shoe validations use `minicpm-v object understanding` without `vision-fallback-to-mock`.
- Actual: all three validations returned schema-valid traces, but every trace included `vision-fallback-to-mock`.
- Impact: hosted Space MiniCPM-V evidence is not ready for submission; stable mock demo remains usable.
- Fallback used: mock object understanding plus mock text runtime.
- Resolution: unresolved when recorded; the next step was to inspect Space runtime logs or add non-secret fallback diagnostics for the MiniCPM-V load/chat exception. Later resolved by the `HF_TOKEN` Space secret (see the next entry).
- Evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.

## 2026-06-08 - Hosted ZeroGPU MiniCPM-V Validation Passes After HF_TOKEN Secret

- Area: Hugging Face Space vision runtime.
- Reproduction: Run `scripts/check_space_vlm.py` with `--configure-space --hardware zero-a10g --rollback-to-mock` against `build-small-hackathon/ObjectverseDiary`.
- Expected: mug, keyboard, and shoe validations use `minicpm-v object understanding` without `vision-fallback-to-mock`.
- Actual: all three validations passed; the probe reported `minicpm_load_ok=True`; the only marker in the traces is `mock-text-runtime`.
- Impact: OpenBMB / hosted MiniCPM-V vision evidence is ready. Public Space still rolls back to mock-safe defaults after validation.
- Resolution: resolved by adding an `HF_TOKEN` Space secret with gated model access.
- Evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.

## Previous Space VLM Validation Failure

- Updated: 2026-06-08 08:33:19 UTC
- Area: Hugging Face Space vision runtime.
- Probe backend: `minicpm-v`
- MiniCPM load attempted: `True`
- MiniCPM load ok: `False`
- Probe errors: `minicpm_load=OSError`
- Failed checks: mug: vision fallback marker was present; keyboard: vision fallback marker was present; shoe: vision fallback marker was present
- Fallback used: mock object understanding plus mock text runtime if validation reaches generation.
- Resolution: unresolved; keep the public Space mock-safe until this section reports a passing VLM validation.

## 2026-06-08 - LoRA v2 GGUF Local Smoke Passed

- Area: llama.cpp text runtime evidence.
- Reproduction: Run `scripts/check_llama_cpp_smoke.py` with `models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf` after optional `llama-cpp-python` installation.
- Expected: the trace records `llama-cpp text generation`, and persona, diary, and chat run without `text-fallback-to-mock`.
- Actual: passed locally; trace included only `mock-vision-runtime` because text was real and vision remained mock for the smoke input.
- Impact: local llama.cpp text runtime evidence is ready. Public Space text runtime is still not validated with this GGUF.
- Fallback used: none for text.
- Resolution: resolved locally by using the merged LoRA v2 Q4_K_M GGUF and conservative JSON extraction / decoding settings.
- Evidence: `scripts/check_llama_cpp_smoke.py`, `docs/RUNTIME.md`, and the Hub file in `qqyule/objectverse-diary-qwen15b-lora`.

## 2026-06-08 - Hugging Face Xet GGUF Upload Stalled

- Area: Hugging Face model file upload.
- Reproduction: Upload `models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf` with the default Hub client path.
- Expected: upload completes and commits `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`.
- Actual: the first upload stalled with Xet TLS EOF / `CLOSE_WAIT` after partial progress.
- Impact: upload needed a retry; local GGUF file was unaffected.
- Fallback used: stopped the stalled upload process and retried with `HF_HUB_DISABLE_XET=1`.
- Resolution: resolved; ordinary Hub/LFS upload succeeded.
- Evidence: Hub file `https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora/blob/main/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`.
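
The retry with `HF_HUB_DISABLE_XET=1` can be scripted roughly as below. This is a sketch under assumptions: the record does not say which upload command was used, so the `huggingface-cli upload <repo_id> <local_path> <path_in_repo>` invocation and the helper names are illustrative only. Running the upload in a fresh process ensures the variable is set before the Hub client starts.

```python
import os
import subprocess

REPO_ID = "qqyule/objectverse-diary-qwen15b-lora"
GGUF_NAME = "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf"


def build_retry_env(base_env: dict) -> dict:
    """Copy the environment and disable the Xet transfer path."""
    env = dict(base_env)
    env["HF_HUB_DISABLE_XET"] = "1"
    return env


def build_upload_command(local_path: str) -> list:
    """Assumed CLI form: huggingface-cli upload <repo_id> <local_path> <path_in_repo>."""
    return ["huggingface-cli", "upload", REPO_ID, local_path, GGUF_NAME]


def retry_upload(local_path: str) -> int:
    """Run the upload in a child process with Xet disabled; return its exit code."""
    result = subprocess.run(
        build_upload_command(local_path),
        env=build_retry_env(dict(os.environ)),
        check=False,
    )
    return result.returncode
```

A call such as `retry_upload("models/" + GGUF_NAME)` would then perform the retry; stop any stalled upload process first, as the record describes.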

## Anticipated Failure Areas

### Vision Runtime

- MiniCPM-V or fallback VLM fails to load in local or Space environment.
- Uploaded image is unsupported, too large, or not an object photo.
- Model output is not valid JSON.
- Object recognition is too vague for persona generation.

Fallback:

- use manual object description
- use stable example flow
- record fallback marker in trace
- `vision-fallback-to-mock` means MiniCPM-V failed or returned invalid JSON and mock object understanding was used.
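
The vision fallback above can be sketched as a wrapper that tries the real backend, then falls back to mock output and records the marker. Only the marker string comes from this file; `understand_object`, `mock_object_understanding`, the `vlm_call` parameter, and the mock payload are hypothetical and are not the app's actual code.

```python
import json
from typing import Callable, List

FALLBACK_MARKER = "vision-fallback-to-mock"


def mock_object_understanding(image_path: str) -> dict:
    """Hypothetical deterministic stand-in for the mock backend."""
    return {"object": "unknown object", "source": "mock"}


def understand_object(
    image_path: str,
    vlm_call: Callable[[str], str],
    trace: List[str],
) -> dict:
    """Return parsed VLM output, or mock output plus a trace marker on failure."""
    try:
        raw = vlm_call(image_path)  # may raise, e.g. OSError on model load
        parsed = json.loads(raw)    # raises ValueError on invalid JSON
        if not isinstance(parsed, dict):
            raise ValueError("expected a JSON object")
        return parsed
    except Exception:
        # Broad catch on purpose: any load, chat, or parse failure
        # should degrade to the mock path instead of breaking the demo.
        trace.append(FALLBACK_MARKER)
        return mock_object_understanding(image_path)
```

Appending the marker to the trace, rather than only logging it, is what lets a validation script tell a real MiniCPM-V result from a mock one.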

### Text Runtime

- GGUF model path is missing.
- llama.cpp or llama-cpp-python import fails.
- Generation output does not match `PersonaEnvelope` or `DiaryEntry` schema.
- Chat response loses persona consistency.

Fallback:

- use deterministic mock text runtime
- repair JSON before schema validation
- record fallback marker in trace
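
One conservative way to "repair JSON before schema validation" is to pull the first decodable JSON object out of the model's text, check required keys, and fall back to mock output otherwise. This is an illustrative sketch, not the project's implementation: `REQUIRED_KEYS` is a made-up stand-in because the real `PersonaEnvelope` fields are not listed here, and the function names and mock payload are hypothetical.

```python
import json
from typing import List, Optional

# Hypothetical required fields; the real PersonaEnvelope schema is not shown in this file.
REQUIRED_KEYS = {"name", "voice"}
TEXT_FALLBACK_MARKER = "text-fallback-to-mock"


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object that decodes from any '{' in the text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not start valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


def persona_or_mock(raw_output: str, trace: List[str]) -> dict:
    """Validate extracted JSON against the required keys, else use mock output."""
    obj = extract_json_object(raw_output)
    if obj is None or not REQUIRED_KEYS.issubset(obj):
        trace.append(TEXT_FALLBACK_MARKER)
        return {"name": "Mock Object", "voice": "neutral"}
    return obj
```

This handles output wrapped in prose or code fences; it does not attempt to fix malformed JSON such as trailing commas, which would need a separate repair step.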

### Dataset And Fine-Tuning

- Candidate samples contain private information.
- JSONL rows fail to parse.
- Curated data is too repetitive.
- Training run fails or adapter quality is weaker than mock prompts.

Fallback:

- keep mock preview data for schema validation only
- separate raw candidates from curated rows
- publish only reviewed examples
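
JSONL parse failures are easiest to catch before rows move from raw candidates into the curated set. A minimal sketch, assuming one JSON object per line; `split_jsonl_rows` is a hypothetical helper and not part of the repository:

```python
import json
from typing import Iterable, List, Tuple


def split_jsonl_rows(lines: Iterable[str]) -> Tuple[List[dict], List[Tuple[int, str]]]:
    """Return (parsed_rows, rejected), where rejected holds (line_number, reason)."""
    parsed: List[dict] = []
    rejected: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines without rejecting them
        try:
            row = json.loads(stripped)
        except json.JSONDecodeError as exc:
            rejected.append((number, exc.msg))
            continue
        if not isinstance(row, dict):
            rejected.append((number, "row is not a JSON object"))
            continue
        parsed.append(row)
    return parsed, rejected
```

Keeping the rejected line numbers makes it possible to fix or drop bad rows in the raw file without touching rows that already passed review.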

### Hugging Face Space

- Space cannot start because of missing packages.
- Model files are too large or not available.
- CPU runtime is too slow.
- Secrets or private paths appear in logs.

Fallback:

- launch with mock runtime first
- keep model files out of the repo
- document runtime mode clearly in README