Deploy Hub GGUF downloader runtime
- README.md +29 -18
- docs/RUNTIME.md +37 -8
- requirements.txt +2 -0
- src/config.py +17 -3
- src/models/llama_cpp_runner.py +79 -8
- tests/test_mock_mvp.py +100 -0
README.md
CHANGED
|
@@ -23,15 +23,15 @@ Upload a photo of any everyday object. The app wakes it up, gives it a secret pe
|
|
| 23 |
|
| 24 |
## Current Status
|
| 25 |
|
| 26 |
-
Stable mock-safe submission baseline, MiniCPM-V vision backend wiring, non-secret hosted vision diagnostics, optional llama.cpp text runtime wiring, a local GGUF smoke
|
| 27 |
|
| 28 |
By default, the app uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. This keeps the public demo reproducible and avoids commercial AI APIs.
|
| 29 |
|
| 30 |
`OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the optional MiniCPM-V 2.6 vision path. The hosted ZeroGPU validation on June 8, 2026 passed for public mug, keyboard, and shoe images after the Space received an `HF_TOKEN` secret with access to the gated `openbmb/MiniCPM-V-2_6` model. The public Space still rolls back to mock mode after validation so the default demo remains stable.
|
| 31 |
|
| 32 |
-
`OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured.
|
| 33 |
|
| 34 |
-
`scripts/check_llama_cpp_smoke.py`
|
| 35 |
|
| 36 |
Hugging Face Space:
|
| 37 |
|
|
@@ -61,14 +61,14 @@ The interface is English-first and Chinese-second.
|
|
| 61 |
- [x] Sharing is Caring — public mock traces, JSONL export, prompt templates, and failure notes.
|
| 62 |
- [x] Field Notes — article draft in `docs/FIELD_NOTES.md`.
|
| 63 |
- [x] OpenBMB Special — MiniCPM-V 2.6 wiring exists and hosted ZeroGPU validation passed for mug, keyboard, and shoe.
|
| 64 |
-
- [
|
| 65 |
-
- [x] Well-Tuned — synthetic curated SFT dataset and Qwen 1.5B LoRA
|
| 66 |
- [ ] Off the Grid — no commercial AI APIs are used; final badge eligibility depends on hackathon review.
|
| 67 |
|
| 68 |
## Planned Model Stack
|
| 69 |
|
| 70 |
- Vision: MiniCPM-V 2.6 or deterministic mock fallback
|
| 71 |
-
- Text: deterministic mock text
|
| 72 |
- Runtime: llama.cpp / llama-cpp-python
|
| 73 |
- UI: Gradio Blocks
|
| 74 |
|
|
@@ -81,8 +81,8 @@ Stable baseline:
|
|
| 81 |
- default vision backend: deterministic mock, 0 active model parameters
|
| 82 |
- default text backend: deterministic mock, 0 active model parameters
|
| 83 |
- optional wired vision model: MiniCPM-V 2.6, about 8B parameters when enabled
|
| 84 |
-
- optional text base for published LoRA adapter: Qwen/Qwen2.5-1.5B-Instruct, about 1.5B parameters
|
| 85 |
-
- optional text GGUF:
|
| 86 |
|
| 87 |
The stable public demo therefore stays within the 32B budget. Optional MiniCPM-V plus Qwen 1.5B remains about 9.5B plus a small LoRA adapter, safely under the 32B budget.
|
| 88 |
|
|
@@ -97,26 +97,36 @@ Then open the local Gradio URL printed in the terminal.
|
|
| 97 |
|
| 98 |
## Optional llama.cpp Text Runtime
|
| 99 |
|
| 100 |
-
The project does not commit GGUF files
|
| 101 |
|
| 102 |
```bash
|
| 103 |
-
pip install llama-cpp-python
|
| 104 |
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
|
| 105 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
|
| 106 |
python app.py
|
| 107 |
```
|
| 108 |
|
| 109 |
-
|
| 110 |
|
| 111 |
Recommended explicit-confirmation smoke path:
|
| 112 |
|
| 113 |
```bash
|
| 114 |
-
# Download externally, do not commit the GGUF:
|
| 115 |
-
# https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF
|
| 116 |
-
# file: qwen2.5-1.5b-instruct-q4_k_m.gguf
|
| 117 |
-
|
| 118 |
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
|
| 119 |
-
--model-path models/
|
| 120 |
```
|
| 121 |
|
| 122 |
## Initial MVP Flow
|
|
@@ -141,8 +151,9 @@ The stable submission baseline supports:
|
|
| 141 |
- Initial acceptance report: `docs/INITIAL_STAGE_REPORT.md`
|
| 142 |
- Runtime notes: `docs/RUNTIME.md`
|
| 143 |
- Dataset preview notes: `docs/DATASET.md`
|
| 144 |
-
- Synthetic curated dataset: https://huggingface.co/datasets/qqyule/objectverse-diary-sft-curated
|
| 145 |
-
- Fine-tuned LoRA adapter: https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora
|
|
|
|
| 146 |
- Public mock traces: `data/traces/samples/`
|
| 147 |
- Trace JSONL export: `data/traces/samples/objectverse_public_mock_traces.jsonl`
|
| 148 |
- Hosted VLM validation evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, `data/traces/space-vlm/`
|
|
|
|
| 23 |
|
| 24 |
## Current Status
|
| 25 |
|
| 26 |
+
Stable mock-safe submission baseline, MiniCPM-V vision backend wiring, non-secret hosted vision diagnostics, optional llama.cpp text runtime wiring, a passing local LoRA v2 GGUF smoke test, public mock traces, Space validation evidence, a published curated v2 SFT dataset, a published Qwen 1.5B LoRA v2 adapter, and a published Q4_K_M GGUF are available.
|
| 27 |
|
| 28 |
By default, the app uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. This keeps the public demo reproducible and avoids commercial AI APIs.
|
| 29 |
|
| 30 |
`OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the optional MiniCPM-V 2.6 vision path. The hosted ZeroGPU validation on June 8, 2026 passed for public mug, keyboard, and shoe images after the Space received an `HF_TOKEN` secret with access to the gated `openbmb/MiniCPM-V-2_6` model. The public Space still rolls back to mock mode after validation so the default demo remains stable.
|
| 31 |
|
| 32 |
+
`OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured. The Modal-trained LoRA v2 adapter has been merged with `Qwen/Qwen2.5-1.5B-Instruct`, quantized to Q4_K_M, uploaded to the same model repo, and smoke-tested locally through llama.cpp. No GGUF file is committed in Git, and the public Space is still kept on the mock-safe text runtime until a separate Space validation pass is run.
|
| 33 |
|
| 34 |
+
`scripts/check_llama_cpp_smoke.py` passed locally on June 8, 2026 with `models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`. The published GGUF is available in `qqyule/objectverse-diary-qwen15b-lora` as `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`.
|
| 35 |
|
| 36 |
Hugging Face Space:
|
| 37 |
|
|
|
|
| 61 |
- [x] Sharing is Caring — public mock traces, JSONL export, prompt templates, and failure notes.
|
| 62 |
- [x] Field Notes — article draft in `docs/FIELD_NOTES.md`.
|
| 63 |
- [x] OpenBMB Special — MiniCPM-V 2.6 wiring exists and hosted ZeroGPU validation passed for mug, keyboard, and shoe.
|
| 64 |
+
- [x] Llama Champion — local llama.cpp GGUF runtime passed with the published LoRA v2 Q4_K_M model; Space text runtime remains mock-safe.
|
| 65 |
+
- [x] Well-Tuned — synthetic curated v2 SFT dataset and Qwen 1.5B LoRA v2 adapter are published.
|
| 66 |
- [ ] Off the Grid — no commercial AI APIs are used; final badge eligibility depends on hackathon review.
|
| 67 |
|
| 68 |
## Planned Model Stack
|
| 69 |
|
| 70 |
- Vision: MiniCPM-V 2.6 or deterministic mock fallback
|
| 71 |
+
- Text: deterministic mock text by default; optional published Qwen 1.5B LoRA v2 Q4_K_M GGUF for local llama.cpp runtime
|
| 72 |
- Runtime: llama.cpp / llama-cpp-python
|
| 73 |
- UI: Gradio Blocks
|
| 74 |
|
|
|
|
| 81 |
- default vision backend: deterministic mock, 0 active model parameters
|
| 82 |
- default text backend: deterministic mock, 0 active model parameters
|
| 83 |
- optional wired vision model: MiniCPM-V 2.6, about 8B parameters when enabled
|
| 84 |
+
- optional text base for published LoRA v2 adapter: Qwen/Qwen2.5-1.5B-Instruct, about 1.5B parameters
|
| 85 |
+
- optional text GGUF: published `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`, about 1.5B base parameters plus a small merged LoRA delta; not committed to Git
|
| 86 |
|
| 87 |
The stable public demo therefore stays within the 32B budget. Optional MiniCPM-V plus Qwen 1.5B remains about 9.5B plus a small LoRA adapter, safely under the 32B budget.
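As a quick check of the budget arithmetic above, the sketch below sums the approximate parameter counts quoted in this section. The variable names are illustrative only, and treating the merged LoRA delta as negligible is an assumption rather than a measured figure.

```python
# Approximate parameter counts in billions, as quoted in the README.
BUDGET_B = 32.0
optional_stack_b = {
    "MiniCPM-V 2.6 (vision)": 8.0,
    "Qwen2.5-1.5B-Instruct (text base)": 1.5,
}

# The merged LoRA delta is small and is ignored in this rough total (assumption).
total_b = sum(optional_stack_b.values())
headroom_b = BUDGET_B - total_b

print(f"optional stack: about {total_b}B, headroom: about {headroom_b}B")
```

The default mock backends contribute 0 active parameters, so only the optional stack counts toward the total.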
|
| 88 |
|
|
|
|
| 97 |
|
| 98 |
## Optional llama.cpp Text Runtime
|
| 99 |
|
| 100 |
+
The project does not commit GGUF files. The Space dependencies include `llama-cpp-python`, but the model is only used when `OBJECTVERSE_TEXT_BACKEND=llama-cpp`. To try a local GGUF text model:
|
| 101 |
|
| 102 |
```bash
|
|
|
|
| 103 |
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
|
| 104 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
|
| 105 |
python app.py
|
| 106 |
```
|
| 107 |
|
| 108 |
+
For Hugging Face Space runtime, use Hub download variables instead of committing the GGUF:
|
| 109 |
+
|
| 110 |
+
```bash
|
| 111 |
+
OBJECTVERSE_TEXT_BACKEND=llama-cpp
|
| 112 |
+
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
|
| 113 |
+
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 114 |
+
```
|
| 115 |
+
|
| 116 |
+
If `llama-cpp-python` is missing, no local or Hub model source is configured, the model cannot download/load, or the model returns invalid JSON, the app falls back to deterministic mock text generation and records `text-fallback-to-mock` in traces.
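The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `generate_with_fallback`, `broken_backend`, and `mock_backend` are hypothetical names, and only the `text-fallback-to-mock` marker comes from the README.

```python
from typing import Callable

# Marker string recorded in traces, as named in the README.
TEXT_FALLBACK_TO_MOCK = "text-fallback-to-mock"


def generate_with_fallback(
    real_generate: Callable[[], dict],
    mock_generate: Callable[[], dict],
    fallbacks: list[str],
) -> dict:
    """Try the real text backend; on any failure, record the marker and use the mock."""
    try:
        return real_generate()
    except Exception:
        # In this sketch, a missing package, a missing model source, download or
        # load errors, and invalid JSON all surface as exceptions.
        if TEXT_FALLBACK_TO_MOCK not in fallbacks:
            fallbacks.append(TEXT_FALLBACK_TO_MOCK)
        return mock_generate()


def broken_backend() -> dict:
    # Hypothetical stand-in for a backend whose optional dependency is missing.
    raise ImportError("llama_cpp is not installed")


def mock_backend() -> dict:
    # Hypothetical deterministic mock output.
    return {"diary": "mock diary entry"}


trace_fallbacks: list[str] = []
result = generate_with_fallback(broken_backend, mock_backend, trace_fallbacks)
```

Catching a broad `Exception` keeps the public demo responsive at the cost of hiding the specific failure, which is why a trace marker is recorded alongside the mock result.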
|
| 117 |
|
| 118 |
Recommended explicit-confirmation smoke path:
|
| 119 |
|
| 120 |
```bash
|
| 121 |
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
|
| 122 |
+
--model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 123 |
+
```
|
| 124 |
+
|
| 125 |
+
Published GGUF source:
|
| 126 |
+
|
| 127 |
+
```text
|
| 128 |
+
repo: qqyule/objectverse-diary-qwen15b-lora
|
| 129 |
+
file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 130 |
```
|
| 131 |
|
| 132 |
## Initial MVP Flow
|
|
|
|
| 151 |
- Initial acceptance report: `docs/INITIAL_STAGE_REPORT.md`
|
| 152 |
- Runtime notes: `docs/RUNTIME.md`
|
| 153 |
- Dataset preview notes: `docs/DATASET.md`
|
| 154 |
+
- Synthetic curated v2 dataset: https://huggingface.co/datasets/qqyule/objectverse-diary-sft-curated
|
| 155 |
+
- Fine-tuned LoRA v2 adapter: https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora
|
| 156 |
+
- LoRA v2 Q4_K_M GGUF: https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora/blob/main/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 157 |
- Public mock traces: `data/traces/samples/`
|
| 158 |
- Trace JSONL export: `data/traces/samples/objectverse_public_mock_traces.jsonl`
|
| 159 |
- Hosted VLM validation evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, `data/traces/space-vlm/`
|
docs/RUNTIME.md
CHANGED
|
@@ -29,16 +29,33 @@ This only replaces object understanding. Persona generation, diary generation, a
|
|
| 29 |
Optional llama.cpp text generation can be enabled without changing the UI:
|
| 30 |
|
| 31 |
```bash
|
| 32 |
-
pip install llama-cpp-python
|
| 33 |
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
|
| 34 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
|
| 35 |
.venv/bin/python app.py
|
| 36 |
```
|
| 37 |
|
| 38 |
-
|
| 39 |
|
| 40 |
The runtime trace intentionally records only whether an external GGUF path was configured, not the literal `TEXT_MODEL_PATH`, so local private paths do not leak into public traces.
|
| 41 |
|
| 42 |
## Runtime Diagnostics
|
| 43 |
|
| 44 |
The Gradio app exposes two hidden diagnostic APIs:
|
|
@@ -52,19 +69,19 @@ These APIs are for validation scripts and are not visible in the main UI. They m
|
|
| 52 |
|
| 53 |
## Optional GGUF Smoke Test
|
| 54 |
|
| 55 |
-
Recommended
|
| 56 |
|
| 57 |
```text
|
| 58 |
-
repo:
|
| 59 |
-
file:
|
| 60 |
-
local path: models/
|
| 61 |
```
|
| 62 |
|
| 63 |
-
The `models/` directory and `*.gguf` are ignored by Git. After downloading the file externally and installing optional `llama-cpp-python`
|
| 64 |
|
| 65 |
```bash
|
| 66 |
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
|
| 67 |
-
--model-path models/
|
| 68 |
```
|
| 69 |
|
| 70 |
A passing smoke test must show `llama-cpp text generation` and must not include `text-fallback-to-mock` in either generation or chat fallback markers.
|
|
@@ -76,6 +93,9 @@ OBJECTVERSE_VISION_BACKEND=mock
|
|
| 76 |
OBJECTVERSE_TEXT_BACKEND=mock
|
| 77 |
VISION_MODEL_ID=
|
| 78 |
TEXT_MODEL_PATH=
|
| 79 |
TRACE_OUTPUT_DIR=data/traces
|
| 80 |
```
|
| 81 |
|
|
@@ -96,6 +116,15 @@ OBJECTVERSE_TEXT_BACKEND=llama-cpp
|
|
| 96 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
|
| 97 |
```
|
| 98 |
|
| 99 |
Do not commit GGUF files or private model paths.
|
| 100 |
|
| 101 |
## Future Runtime Boundary
|
|
|
|
| 29 |
Optional llama.cpp text generation can be enabled without changing the UI:
|
| 30 |
|
| 31 |
```bash
|
|
|
|
| 32 |
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
|
| 33 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
|
| 34 |
.venv/bin/python app.py
|
| 35 |
```
|
| 36 |
|
| 37 |
+
For a hosted Space where the GGUF is stored on Hugging Face Hub instead of the local filesystem, configure the Hub source instead of `TEXT_MODEL_PATH`:
|
| 38 |
+
|
| 39 |
+
```bash
|
| 40 |
+
OBJECTVERSE_TEXT_BACKEND=llama-cpp
|
| 41 |
+
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
|
| 42 |
+
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 43 |
+
```
|
| 44 |
+
|
| 45 |
+
`TEXT_MODEL_REVISION` is optional and defaults to the Hub repo default branch. If `TEXT_MODEL_PATH` is set, it takes precedence over Hub download variables.
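The precedence rule above can be illustrated with a small sketch. `describe_text_model_source` is a hypothetical helper written for this example; only the environment variable names come from the runtime notes.

```python
def describe_text_model_source(env: dict[str, str]) -> str:
    """Return which model source would win under the documented precedence."""
    # A non-empty local path always takes precedence over Hub variables.
    if env.get("TEXT_MODEL_PATH", "").strip():
        return "local path"

    repo_id = env.get("TEXT_MODEL_REPO_ID", "").strip()
    filename = env.get("TEXT_MODEL_FILENAME", "").strip()
    if repo_id and filename:
        # An empty revision means the Hub repo's default branch is used.
        revision = env.get("TEXT_MODEL_REVISION", "").strip()
        suffix = f"@{revision}" if revision else ""
        return f"hub: {repo_id}/{filename}{suffix}"

    return "not configured"


hub_only = describe_text_model_source(
    {
        "TEXT_MODEL_REPO_ID": "qqyule/objectverse-diary-qwen15b-lora",
        "TEXT_MODEL_FILENAME": "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
    }
)
both_set = describe_text_model_source(
    {
        "TEXT_MODEL_PATH": "/absolute/path/to/text-model.gguf",
        "TEXT_MODEL_REPO_ID": "qqyule/objectverse-diary-qwen15b-lora",
        "TEXT_MODEL_FILENAME": "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
    }
)
```

Both Hub variables must be set for the Hub source to apply; a repo ID without a filename is treated as not configured in this sketch.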
|
| 46 |
+
|
| 47 |
+
`llama-cpp-python` and `huggingface_hub` are installed by the Space runtime dependencies. A missing package, a missing model source, a download error, a model loading error, invalid JSON, or a schema validation error each triggers a fallback to deterministic mock text generation.
|
| 48 |
|
| 49 |
The runtime trace intentionally records only whether an external GGUF path was configured, not the literal `TEXT_MODEL_PATH`, so local private paths do not leak into public traces.
|
| 50 |
|
| 51 |
+
Local LoRA v2 GGUF status:
|
| 52 |
+
|
| 53 |
+
- Base model: `Qwen/Qwen2.5-1.5B-Instruct`
|
| 54 |
+
- Adapter / GGUF repo: `qqyule/objectverse-diary-qwen15b-lora`
|
| 55 |
+
- Published GGUF: `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`
|
| 56 |
+
- Local smoke: passed on 2026-06-08 with `llama-cpp text generation` and no `text-fallback-to-mock`
|
| 57 |
+
- Space runtime: not switched to llama.cpp yet; the public Space text path remains mock-safe until a separate Space validation passes
|
| 58 |
+
|
| 59 |
## Runtime Diagnostics
|
| 60 |
|
| 61 |
The Gradio app exposes two hidden diagnostic APIs:
|
|
|
|
| 69 |
|
| 70 |
## Optional GGUF Smoke Test
|
| 71 |
|
| 72 |
+
Recommended LoRA v2 smoke model:
|
| 73 |
|
| 74 |
```text
|
| 75 |
+
repo: qqyule/objectverse-diary-qwen15b-lora
|
| 76 |
+
file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 77 |
+
local path: models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 78 |
```
|
| 79 |
|
| 80 |
+
The `models/` directory and `*.gguf` are ignored by Git. After downloading the file externally and installing optional `llama-cpp-python`, run:
|
| 81 |
|
| 82 |
```bash
|
| 83 |
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
|
| 84 |
+
--model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 85 |
```
|
| 86 |
|
| 87 |
A passing smoke test must show `llama-cpp text generation` and must not include `text-fallback-to-mock` in either generation or chat fallback markers.
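The pass criteria above can be expressed as a small predicate. This is a sketch under assumptions: `smoke_passed` and its argument names are hypothetical and do not reflect the actual interface of `scripts/check_llama_cpp_smoke.py`; only the two literal strings come from the runtime notes.

```python
FALLBACK_MARKER = "text-fallback-to-mock"
EXPECTED_TEXT_RUNTIME = "llama-cpp text generation"


def smoke_passed(
    text_runtime: str,
    generation_fallbacks: list[str],
    chat_fallbacks: list[str],
) -> bool:
    """Pass only if llama.cpp was selected and neither stage fell back to mock."""
    return (
        text_runtime == EXPECTED_TEXT_RUNTIME
        and FALLBACK_MARKER not in generation_fallbacks
        and FALLBACK_MARKER not in chat_fallbacks
    )


clean_run = smoke_passed("llama-cpp text generation", [], [])
# The runtime label alone is not enough: a run can report llama-cpp as the
# selected backend and still have fallen back to mock during generation.
silent_fallback = smoke_passed("llama-cpp text generation", ["text-fallback-to-mock"], [])
```

Checking both conditions matters because the runtime label reflects the configured backend, while the fallback markers reflect what actually happened during the run.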
|
|
|
|
| 93 |
OBJECTVERSE_TEXT_BACKEND=mock
|
| 94 |
VISION_MODEL_ID=
|
| 95 |
TEXT_MODEL_PATH=
|
| 96 |
+
TEXT_MODEL_REPO_ID=
|
| 97 |
+
TEXT_MODEL_FILENAME=
|
| 98 |
+
TEXT_MODEL_REVISION=
|
| 99 |
TRACE_OUTPUT_DIR=data/traces
|
| 100 |
```
|
| 101 |
|
|
|
|
| 116 |
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
|
| 117 |
```
|
| 118 |
|
| 119 |
+
For a Space runtime that should download the published LoRA v2 GGUF from Hub, set:
|
| 120 |
+
|
| 121 |
+
```bash
|
| 122 |
+
OBJECTVERSE_VISION_BACKEND=mock
|
| 123 |
+
OBJECTVERSE_TEXT_BACKEND=llama-cpp
|
| 124 |
+
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
|
| 125 |
+
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
|
| 126 |
+
```
|
| 127 |
+
|
| 128 |
Do not commit GGUF files or private model paths.
|
| 129 |
|
| 130 |
## Future Runtime Boundary
|
requirements.txt
CHANGED
|
@@ -7,3 +7,5 @@ Pillow
|
|
| 7 |
sentencepiece
|
| 8 |
accelerate
|
| 9 |
spaces>=0.30
|
|
|
|
|
|
|
|
|
| 7 |
sentencepiece
|
| 8 |
accelerate
|
| 9 |
spaces>=0.30
|
| 10 |
+
huggingface_hub>=0.34,<1
|
| 11 |
+
llama-cpp-python>=0.3,<0.4
|
src/config.py
CHANGED
|
@@ -26,6 +26,9 @@ class RuntimeSettings:
|
|
| 26 |
vision_backend: str
|
| 27 |
text_backend: str
|
| 28 |
text_model_path: str
|
| 29 |
vision_model_id: str
|
| 30 |
trace_output_dir: Path
|
| 31 |
|
|
@@ -36,6 +39,9 @@ def get_runtime_settings(environ: Mapping[str, str] | None = None) -> RuntimeSet
|
|
| 36 |
vision_backend=env.get("OBJECTVERSE_VISION_BACKEND", "mock"),
|
| 37 |
text_backend=env.get("OBJECTVERSE_TEXT_BACKEND", "mock"),
|
| 38 |
text_model_path=env.get("TEXT_MODEL_PATH", ""),
|
| 39 |
vision_model_id=env.get("VISION_MODEL_ID", ""),
|
| 40 |
trace_output_dir=Path(env.get("TRACE_OUTPUT_DIR", "data/traces")),
|
| 41 |
)
|
|
@@ -61,13 +67,21 @@ def runtime_status(settings: RuntimeSettings | None = None) -> dict[str, str]:
|
|
| 61 |
if text_backend == "mock":
|
| 62 |
runtime_parts.append("no llama.cpp model connected yet")
|
| 63 |
else:
|
| 64 |
-
runtime_parts.append(f"text model
|
| 65 |
runtime = "; ".join(runtime_parts)
|
| 66 |
return {"vision": vision, "text": text, "runtime": runtime}
|
| 67 |
|
| 68 |
|
| 69 |
-
def
|
| 70 |
-
|
| 71 |
|
| 72 |
|
| 73 |
SETTINGS = get_runtime_settings()
|
|
|
|
| 26 |
vision_backend: str
|
| 27 |
text_backend: str
|
| 28 |
text_model_path: str
|
| 29 |
+
text_model_repo_id: str
|
| 30 |
+
text_model_filename: str
|
| 31 |
+
text_model_revision: str
|
| 32 |
vision_model_id: str
|
| 33 |
trace_output_dir: Path
|
| 34 |
|
|
|
|
| 39 |
vision_backend=env.get("OBJECTVERSE_VISION_BACKEND", "mock"),
|
| 40 |
text_backend=env.get("OBJECTVERSE_TEXT_BACKEND", "mock"),
|
| 41 |
text_model_path=env.get("TEXT_MODEL_PATH", ""),
|
| 42 |
+
text_model_repo_id=env.get("TEXT_MODEL_REPO_ID", ""),
|
| 43 |
+
text_model_filename=env.get("TEXT_MODEL_FILENAME", ""),
|
| 44 |
+
text_model_revision=env.get("TEXT_MODEL_REVISION", ""),
|
| 45 |
vision_model_id=env.get("VISION_MODEL_ID", ""),
|
| 46 |
trace_output_dir=Path(env.get("TRACE_OUTPUT_DIR", "data/traces")),
|
| 47 |
)
|
|
|
|
| 67 |
if text_backend == "mock":
|
| 68 |
runtime_parts.append("no llama.cpp model connected yet")
|
| 69 |
else:
|
| 70 |
+
runtime_parts.append(f"text model source: {_text_model_source_status(current)}")
|
| 71 |
runtime = "; ".join(runtime_parts)
|
| 72 |
return {"vision": vision, "text": text, "runtime": runtime}
|
| 73 |
|
| 74 |
|
| 75 |
+
def _text_model_source_status(settings: RuntimeSettings) -> str:
|
| 76 |
+
if settings.text_model_path.strip():
|
| 77 |
+
return "[configured external GGUF]"
|
| 78 |
+
repo_id = settings.text_model_repo_id.strip()
|
| 79 |
+
filename = settings.text_model_filename.strip()
|
| 80 |
+
if repo_id and filename:
|
| 81 |
+
revision = settings.text_model_revision.strip()
|
| 82 |
+
suffix = f"@{revision}" if revision else ""
|
| 83 |
+
return f"Hub GGUF: {repo_id}/{filename}{suffix}"
|
| 84 |
+
return "[not configured]"
|
| 85 |
|
| 86 |
|
| 87 |
SETTINGS = get_runtime_settings()
|
src/models/llama_cpp_runner.py
CHANGED
|
@@ -8,7 +8,11 @@ from typing import Any
|
|
| 8 |
|
| 9 |
from src.config import RuntimeSettings, get_runtime_settings
|
| 10 |
from src.models.schema import DiaryEntry, ObjectUnderstanding, Persona, PersonaEnvelope
|
| 11 |
-
from src.prompts.diary_generation import
|
| 12 |
from src.prompts.persona_generation import PERSONA_GENERATION_PROMPT
|
| 13 |
from src.utils.json_repair import parse_json_object
|
| 14 |
|
|
@@ -61,6 +65,22 @@ def generate_persona(object_understanding: ObjectUnderstanding, mode: str) -> Pe
|
|
| 61 |
return _generate_persona_mock(object_understanding, mode)
|
| 62 |
|
| 63 |
|
| 64 |
def generate_diary(persona: PersonaEnvelope, mode: str) -> DiaryEntry:
|
| 65 |
settings = get_runtime_settings()
|
| 66 |
if _is_llama_cpp_backend(settings) and TEXT_FALLBACK_TO_MOCK not in _TEXT_FALLBACKS:
|
|
@@ -164,6 +184,25 @@ def _generate_persona_llama_cpp(
|
|
| 164 |
return PersonaEnvelope.model_validate(raw)
|
| 165 |
|
| 166 |
|
| 167 |
def _generate_diary_llama_cpp(
|
| 168 |
persona: PersonaEnvelope,
|
| 169 |
mode: str,
|
|
@@ -209,7 +248,7 @@ def _run_llama_json(
|
|
| 209 |
settings: RuntimeSettings,
|
| 210 |
max_tokens: int,
|
| 211 |
) -> dict[str, Any]:
|
| 212 |
-
model = _load_llama_model(settings.text_model_path)
|
| 213 |
user_content = json.dumps(user_payload, ensure_ascii=False, indent=2)
|
| 214 |
raw = _complete_llama(
|
| 215 |
model,
|
|
@@ -234,7 +273,8 @@ def _complete_llama(
|
|
| 234 |
{"role": "system", "content": system_prompt},
|
| 235 |
{"role": "user", "content": user_content},
|
| 236 |
],
|
| 237 |
-
temperature=0.
|
|
|
|
| 238 |
max_tokens=max_tokens,
|
| 239 |
stop=stop,
|
| 240 |
)
|
|
@@ -243,7 +283,8 @@ def _complete_llama(
|
|
| 243 |
prompt = f"System:\n{system_prompt}\n\nUser:\n{user_content}\n\nAssistant JSON:\n"
|
| 244 |
response = model(
|
| 245 |
prompt,
|
| 246 |
-
temperature=0.
|
|
|
|
| 247 |
max_tokens=max_tokens,
|
| 248 |
stop=stop,
|
| 249 |
)
|
|
@@ -272,12 +313,10 @@ def _extract_completion_text(response: Any) -> str:
|
|
| 272 |
raise ValueError("llama.cpp response did not include text content.")
|
| 273 |
|
| 274 |
|
| 275 |
-
def _load_llama_model(text_model_path: str) -> Any:
|
| 276 |
global _LLAMA_MODEL, _LLAMA_MODEL_PATH
|
| 277 |
|
| 278 |
-
clean_path =
|
| 279 |
-
if not clean_path:
|
| 280 |
-
raise ValueError("TEXT_MODEL_PATH is not configured.")
|
| 281 |
if not Path(clean_path).exists():
|
| 282 |
raise FileNotFoundError(f"TEXT_MODEL_PATH does not exist: {clean_path}")
|
| 283 |
|
|
@@ -295,6 +334,38 @@ def _load_llama_model(text_model_path: str) -> Any:
|
|
| 295 |
return _LLAMA_MODEL
|
| 296 |
|
| 297 |
|
| 298 |
def _is_llama_cpp_backend(settings: RuntimeSettings) -> bool:
|
| 299 |
return settings.text_backend.strip().lower() in LLAMA_CPP_BACKENDS
|
| 300 |
|
|
|
|
| 8 |
|
| 9 |
from src.config import RuntimeSettings, get_runtime_settings
|
| 10 |
from src.models.schema import DiaryEntry, ObjectUnderstanding, Persona, PersonaEnvelope
|
| 11 |
+
from src.prompts.diary_generation import (
|
| 12 |
+
CHAT_REPLY_PROMPT,
|
| 13 |
+
DIARY_GENERATION_PROMPT,
|
| 14 |
+
PERSONA_DIARY_GENERATION_PROMPT,
|
| 15 |
+
)
|
| 16 |
from src.prompts.persona_generation import PERSONA_GENERATION_PROMPT
|
| 17 |
from src.utils.json_repair import parse_json_object
|
| 18 |
|
|
|
|
| 65 |
return _generate_persona_mock(object_understanding, mode)
|
| 66 |
|
| 67 |
|
| 68 |
+
def generate_persona_and_diary(
|
| 69 |
+
object_understanding: ObjectUnderstanding,
|
| 70 |
+
mode: str,
|
| 71 |
+
) -> tuple[PersonaEnvelope, DiaryEntry]:
|
| 72 |
+
settings = get_runtime_settings()
|
| 73 |
+
if _is_llama_cpp_backend(settings):
|
| 74 |
+
try:
|
| 75 |
+
return _generate_persona_and_diary_llama_cpp(object_understanding, mode, settings)
|
| 76 |
+
except Exception as exc:
|
| 77 |
+
_log_text_fallback("persona+diary", exc)
|
| 78 |
+
_add_text_fallback(TEXT_FALLBACK_TO_MOCK)
|
| 79 |
+
|
| 80 |
+
persona = _generate_persona_mock(object_understanding, mode)
|
| 81 |
+
return persona, _generate_diary_mock(persona, mode)
|
| 82 |
+
|
| 83 |
+
|
| 84 |
def generate_diary(persona: PersonaEnvelope, mode: str) -> DiaryEntry:
|
| 85 |
settings = get_runtime_settings()
|
| 86 |
if _is_llama_cpp_backend(settings) and TEXT_FALLBACK_TO_MOCK not in _TEXT_FALLBACKS:
|
|
|
|
| 184 |
return PersonaEnvelope.model_validate(raw)
|
| 185 |
|
| 186 |
|
| 187 |
+
def _generate_persona_and_diary_llama_cpp(
|
| 188 |
+
object_understanding: ObjectUnderstanding,
|
| 189 |
+
mode: str,
|
| 190 |
+
settings: RuntimeSettings,
|
| 191 |
+
) -> tuple[PersonaEnvelope, DiaryEntry]:
|
| 192 |
+
raw = _run_llama_json(
|
| 193 |
+
system_prompt=PERSONA_DIARY_GENERATION_PROMPT,
|
| 194 |
+
user_payload={
|
| 195 |
+
"mode": mode,
|
| 196 |
+
"object_understanding": object_understanding.model_dump(mode="json"),
|
| 197 |
+
},
|
| 198 |
+
settings=settings,
|
| 199 |
+
max_tokens=1024,
|
| 200 |
+
)
|
| 201 |
+
persona = PersonaEnvelope.model_validate({"persona": raw.get("persona")})
|
| 202 |
+
diary = DiaryEntry.model_validate(raw.get("diary"))
|
| 203 |
+
return persona, diary
|
| 204 |
+
|
| 205 |
+
|
| 206 |
def _generate_diary_llama_cpp(
|
| 207 |
persona: PersonaEnvelope,
|
| 208 |
mode: str,
|
|
|
|
| 248 |
settings: RuntimeSettings,
|
| 249 |
max_tokens: int,
|
| 250 |
) -> dict[str, Any]:
|
| 251 |
+
model = _load_llama_model(settings.text_model_path, settings=settings)
|
| 252 |
user_content = json.dumps(user_payload, ensure_ascii=False, indent=2)
|
| 253 |
raw = _complete_llama(
|
| 254 |
model,
|
|
|
|
| 273 |
{"role": "system", "content": system_prompt},
|
| 274 |
{"role": "user", "content": user_content},
|
| 275 |
],
|
| 276 |
+
temperature=0.2,
|
| 277 |
+
top_p=0.9,
|
| 278 |
max_tokens=max_tokens,
|
| 279 |
stop=stop,
|
| 280 |
)
|
|
|
|
| 283 |
prompt = f"System:\n{system_prompt}\n\nUser:\n{user_content}\n\nAssistant JSON:\n"
|
| 284 |
response = model(
|
| 285 |
prompt,
|
| 286 |
+
temperature=0.2,
|
| 287 |
+
top_p=0.9,
|
| 288 |
max_tokens=max_tokens,
|
| 289 |
stop=stop,
|
| 290 |
)
|
|
|
|
| 313 |
raise ValueError("llama.cpp response did not include text content.")
|
| 314 |
|
| 315 |
|
| 316 |
+
def _load_llama_model(text_model_path: str, *, settings: RuntimeSettings | None = None) -> Any:
|
| 317 |
global _LLAMA_MODEL, _LLAMA_MODEL_PATH
|
| 318 |
|
| 319 |
+
clean_path = _resolve_text_model_path(text_model_path, settings)
|
|
|
|
|
|
|
| 320 |
if not Path(clean_path).exists():
|
| 321 |
raise FileNotFoundError(f"TEXT_MODEL_PATH does not exist: {clean_path}")
|
| 322 |
|
|
|
|
| 334 |
return _LLAMA_MODEL
|
| 335 |
|
| 336 |
|
| 337 |
+
def _resolve_text_model_path(
|
| 338 |
+
text_model_path: str,
|
| 339 |
+
settings: RuntimeSettings | None = None,
|
| 340 |
+
) -> str:
|
| 341 |
+
clean_path = text_model_path.strip()
|
| 342 |
+
if clean_path:
|
| 343 |
+
return clean_path
|
| 344 |
+
|
| 345 |
+
current = settings or get_runtime_settings()
|
| 346 |
+
if current.text_model_repo_id.strip() and current.text_model_filename.strip():
|
| 347 |
+
return _download_hf_gguf(current)
|
| 348 |
+
|
| 349 |
+
raise ValueError(
|
| 350 |
+
"TEXT_MODEL_PATH is not configured, and TEXT_MODEL_REPO_ID/TEXT_MODEL_FILENAME "
|
| 351 |
+
"are not configured."
|
| 352 |
+
)
|
| 353 |
+
|
| 354 |
+
|
| 355 |
+
def _download_hf_gguf(settings: RuntimeSettings) -> str:
|
| 356 |
+
from huggingface_hub import hf_hub_download
|
| 357 |
+
|
| 358 |
+
kwargs: dict[str, str] = {
|
| 359 |
+
"repo_id": settings.text_model_repo_id.strip(),
|
| 360 |
+
"filename": settings.text_model_filename.strip(),
|
| 361 |
+
"repo_type": "model",
|
| 362 |
+
}
|
| 363 |
+
revision = settings.text_model_revision.strip()
|
| 364 |
+
if revision:
|
| 365 |
+
kwargs["revision"] = revision
|
| 366 |
+
return hf_hub_download(**kwargs)
|
| 367 |
+
|
| 368 |
+
|
| 369 |
def _is_llama_cpp_backend(settings: RuntimeSettings) -> bool:
|
| 370 |
return settings.text_backend.strip().lower() in LLAMA_CPP_BACKENDS
|
| 371 |
|
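The Hub download helper in this file passes `revision` only when one is configured. A standalone sketch of that argument-building step is shown below; `build_download_kwargs` is a hypothetical name used for illustration, and the sketch deliberately stops short of calling `hf_hub_download` so it runs without network access.

```python
def build_download_kwargs(repo_id: str, filename: str, revision: str = "") -> dict[str, str]:
    """Build keyword arguments suitable for huggingface_hub.hf_hub_download."""
    kwargs = {
        "repo_id": repo_id.strip(),
        "filename": filename.strip(),
        "repo_type": "model",
    }
    clean_revision = revision.strip()
    if clean_revision:
        # Omitting the key entirely lets the Hub client use the default branch.
        kwargs["revision"] = clean_revision
    return kwargs


default_branch = build_download_kwargs(
    "qqyule/objectverse-diary-qwen15b-lora",
    "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
)
# "main" is only an example revision value for this sketch.
pinned = build_download_kwargs(
    "qqyule/objectverse-diary-qwen15b-lora",
    "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
    revision="main",
)
```

In real use these arguments would be passed as `hf_hub_download(**kwargs)`, which returns the local path of the cached file.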
tests/test_mock_mvp.py
CHANGED
|
@@ -3,11 +3,14 @@
|
|
| 3 |
from __future__ import annotations
|
| 4 |
|
| 5 |
import json
|
|
|
|
| 6 |
import tempfile
|
|
|
|
| 7 |
import unittest
|
| 8 |
from pathlib import Path
|
| 9 |
from unittest.mock import patch
|
| 10 |
|
|
|
|
| 11 |
from src.example_cache import load_sample_generation, sample_trace_path
|
| 12 |
from src.examples import EXAMPLE_OBJECTS, gradio_examples
|
| 13 |
from src.models.llama_cpp_runner import (
|
|
@@ -39,8 +42,10 @@ class FakeMiniCpmModel:
|
|
| 39 |
class FakeLlamaModel:
|
| 40 |
def __init__(self, responses: list[str]) -> None:
|
| 41 |
self.responses = responses
|
|
|
|
| 42 |
|
| 43 |
def create_chat_completion(self, **_: object) -> dict:
|
|
|
|
| 44 |
response = self.responses.pop(0)
|
| 45 |
return {"choices": [{"message": {"content": response}}]}
|
| 46 |
|
|
@@ -72,6 +77,62 @@ class MockMvpTest(unittest.TestCase):
|
|
| 72 |
self.assertIn("[configured external GGUF]", status["runtime"])
|
| 73 |
self.assertNotIn("/Users/leo", status["runtime"])
|
| 74 |
|
| 75 |
def test_examples_cover_six_objects(self) -> None:
|
| 76 |
self.assertEqual(len(EXAMPLE_OBJECTS), 6)
|
| 77 |
self.assertEqual(len(gradio_examples()), 6)
|
|
@@ -177,6 +238,45 @@ class MockMvpTest(unittest.TestCase):
|
|
| 177 |
self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
|
| 178 |
self.assertEqual(result.trace.model_runtime["text"], "llama-cpp text generation")
|
| 179 |
|
| 180 |
def test_minicpm_vision_backend_accepts_valid_json(self) -> None:
|
| 181 |
response = """
|
| 182 |
{"object":{"name":"coffee mug","visible_features":["white ceramic","round handle","desk shadow"],"likely_context":"work desk","confidence":0.88}}
|
|
|
|
| 3 |
from __future__ import annotations
|
| 4 |
|
| 5 |
import json
|
| 6 |
+
import sys
|
| 7 |
import tempfile
|
| 8 |
+
import types
|
| 9 |
import unittest
|
| 10 |
from pathlib import Path
|
| 11 |
from unittest.mock import patch
|
| 12 |
|
| 13 |
+
import src.models.llama_cpp_runner as llama_cpp_runner
|
| 14 |
from src.example_cache import load_sample_generation, sample_trace_path
|
| 15 |
from src.examples import EXAMPLE_OBJECTS, gradio_examples
|
| 16 |
from src.models.llama_cpp_runner import (
|
|
|
|
| 42 |
class FakeLlamaModel:
|
| 43 |
def __init__(self, responses: list[str]) -> None:
|
| 44 |
self.responses = responses
|
| 45 |
+
self.calls = 0
|
| 46 |
|
| 47 |
def create_chat_completion(self, **_: object) -> dict:
|
| 48 |
+
self.calls += 1
|
| 49 |
response = self.responses.pop(0)
|
| 50 |
return {"choices": [{"message": {"content": response}}]}
|
| 51 |
|
|
|
|
@@ -72,6 +77,62 @@
         self.assertIn("[configured external GGUF]", status["runtime"])
         self.assertNotIn("/Users/leo", status["runtime"])

+    def test_llama_cpp_hub_runtime_status_uses_public_repo_summary(self) -> None:
+        settings = get_runtime_settings(
+            {
+                "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
+                "TEXT_MODEL_REPO_ID": "qqyule/objectverse-diary-qwen15b-lora",
+                "TEXT_MODEL_FILENAME": "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
+            }
+        )
+
+        status = runtime_status(settings)
+
+        self.assertEqual(settings.text_model_repo_id, "qqyule/objectverse-diary-qwen15b-lora")
+        self.assertEqual(settings.text_model_filename, "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf")
+        self.assertIn("Hub GGUF", status["runtime"])
+        self.assertIn("qqyule/objectverse-diary-qwen15b-lora", status["runtime"])
+        self.assertNotIn("/home", status["runtime"])
+        self.assertNotIn("/Users", status["runtime"])
+
+    def test_llama_cpp_loads_model_from_hub_config_when_path_is_missing(self) -> None:
+        previous_model = llama_cpp_runner._LLAMA_MODEL
+        previous_path = llama_cpp_runner._LLAMA_MODEL_PATH
+        llama_cpp_runner._LLAMA_MODEL = None
+        llama_cpp_runner._LLAMA_MODEL_PATH = None
+
+        loaded_paths: list[str] = []
+
+        class FakeLlama:
+            def __init__(self, *, model_path: str, **_: object) -> None:
+                loaded_paths.append(model_path)
+
+        fake_module = types.ModuleType("llama_cpp")
+        fake_module.Llama = FakeLlama
+
+        try:
+            with tempfile.TemporaryDirectory() as tmp_dir:
+                model_path = Path(tmp_dir) / "model.gguf"
+                model_path.write_bytes(b"GGUF")
+                settings = get_runtime_settings(
+                    {
+                        "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
+                        "TEXT_MODEL_REPO_ID": "qqyule/objectverse-diary-qwen15b-lora",
+                        "TEXT_MODEL_FILENAME": "objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf",
+                    }
+                )
+
+                with (
+                    patch.dict(sys.modules, {"llama_cpp": fake_module}),
+                    patch("src.models.llama_cpp_runner._download_hf_gguf", return_value=str(model_path)),
+                ):
+                    llama_cpp_runner._load_llama_model("", settings=settings)
+
+                self.assertEqual(loaded_paths, [str(model_path)])
+        finally:
+            llama_cpp_runner._LLAMA_MODEL = previous_model
+            llama_cpp_runner._LLAMA_MODEL_PATH = previous_path
+
     def test_examples_cover_six_objects(self) -> None:
         self.assertEqual(len(EXAMPLE_OBJECTS), 6)
         self.assertEqual(len(gradio_examples()), 6)
@@ -177,6 +238,45 @@
         self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
         self.assertEqual(result.trace.model_runtime["text"], "llama-cpp text generation")

+    def test_pipeline_uses_combined_llama_cpp_persona_and_diary(self) -> None:
+        env = {
+            "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
+            "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
+        }
+        fake_llama = FakeLlamaModel(
+            [
+                """
+                {
+                    "persona": {
+                        "object_name": "coffee mug",
+                        "character_name": "Mugworth",
+                        "mood": "dry and suspicious",
+                        "secret_fear": "being left empty forever",
+                        "core_memory": "It remembers every late-night refill.",
+                        "complaint": "I am treated like a ceramic fuel tank.",
+                        "tags": ["desk witness", "warm archive", "quiet judgment"]
+                    },
+                    "diary": {
+                        "title": "Secret Diary - Day 418",
+                        "english": "Today I held another bitter storm and called it service.",
+                        "chinese": "今天我又装下一场苦涩风暴,并被称为有用。"
+                    }
+                }
+                """,
+            ]
+        )
+
+        with (
+            patch.dict("os.environ", env, clear=False),
+            patch("src.models.llama_cpp_runner._load_llama_model", return_value=fake_llama),
+        ):
+            result = generate_object_diary(None, "old white coffee mug", "Cynical", save=False)
+
+        self.assertEqual(result.persona.persona.character_name, "Mugworth")
+        self.assertEqual(result.diary.title, "Secret Diary - Day 418")
+        self.assertEqual(fake_llama.calls, 1)
+        self.assertNotIn("text-fallback-to-mock", result.trace.fallbacks)
+
     def test_minicpm_vision_backend_accepts_valid_json(self) -> None:
         response = """
         {"object":{"name":"coffee mug","visible_features":["white ceramic","round handle","desk shadow"],"likely_context":"work desk","confidence":0.88}}