Commit History

Update README.md
7c640d5
verified

unity4ar committed on

Strip MiniCPM <think>...</think> reasoning tags from generated text
5f5ab47
verified

unity4ar committed on

Relax huggingface_hub pin to <1.0 to satisfy transformers 4.55-4.57
fdb54cf
verified

unity4ar committed on

Pin transformers<5.0 so KV cache works with MiniCPM bundled code; re-enable use_cache
148a99f
verified

unity4ar committed on

Revert to use_cache=False (eager attn also broken); hide all audio UI
df99cf8
verified

unity4ar committed on

Speed up: eager attn + KV cache; drop chat retries to 1; remove MiniCPM-o voice UI artifact
3326f32
verified

unity4ar committed on

Cap zerogpu max_new_tokens at 256 (use_cache=False makes long generations O(n^2))
c788ee1
verified

unity4ar committed on

Disable KV cache: openbmb modeling_minicpm.py has a cache_utils API drift bug
e06599a
verified

unity4ar committed on

Log + surface RuntimeErrors from witness chat too (still 503 for those)
a5ce744
verified

unity4ar committed on

Surface witness chat failures with traceback + error class in 500 detail
5b4e454
verified

unity4ar committed on

Add the 4 map layer PNGs missing from the initial Docker-era ship (ship_space.sh excluded data/*.png)
46e31a7
verified

unity4ar committed on

Move .to('cuda') inside @spaces.GPU; background thread keeps model on CPU to avoid emulation bypass
5a8fab5
verified

unity4ar committed on

Shim is_torch_fx_available so MiniCPM trust_remote_code import works on transformers >= 5.0
90e360d
verified

unity4ar committed on

Use canonical .to('cuda') pattern + progress logs so container log shows what loader is doing
031ce2d
verified

unity4ar committed on

Surface zerogpu backend load_error / load detail in setup status
47c194f
verified

unity4ar committed on

Eagerly import zerogpu_backend on Spaces so @spaces.GPU is registered before startup scan
50761af
verified

unity4ar committed on

Force demo.launch to bind 0.0.0.0:$PORT on HF Spaces (CLI hot-reload ignores GRADIO_SERVER_NAME)
4f670ea
verified

unity4ar committed on

Load model in background thread so health/status endpoints don't block on 16GB download
fbd952d
verified

unity4ar committed on

Expose `demo` at module scope so Gradio SDK runner can launch the gr.Server app
5cb944e
verified

unity4ar committed on

Default provider to zerogpu_transformers on HF Spaces; drop bogus README env block
e9ef2b5
verified

unity4ar committed on

Gate setup/llama subprocess paths behind provider check; allow zerogpu_transformers
50a467b
verified

unity4ar committed on

Load model on cuda at module level (canonical ZeroGPU pattern)
a475083
verified

unity4ar committed on

Refactor: Docker+llama.cpp -> Gradio SDK + ZeroGPU transformers backend
7036a02
verified

unity4ar committed on

Fix port collision: scope PORT env to llama.cpp subprocess
16501bb
verified

unity4ar committed on

Add Space README
135a74d
verified

unity4ar committed on

Ship Phantom Grid Docker Space
d2e6f94
verified

unity4ar committed on

initial commit
1561677
verified

unity4ar committed on
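
Note on 5f5ab47 (Strip MiniCPM <think>...</think> reasoning tags from generated text): the stripping step could look roughly like the sketch below. This is not the code from that commit. The helper name `strip_think_tags` and the handling of an unclosed `<think>` tag (as might occur when generation is cut off by a token cap) are assumptions made for illustration.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
# Matches an unclosed <think> that runs to the end of the text
# (assumed case: output truncated before the closing tag was generated).
_THINK_OPEN = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think_tags(text: str) -> str:
    """Hypothetical helper: remove reasoning blocks, then trim whitespace."""
    text = _THINK_BLOCK.sub("", text)
    text = _THINK_OPEN.sub("", text)
    return text.strip()
```

Removing the unclosed case as well is a design choice: it hides partial reasoning from the user, at the cost of returning an empty string when the whole output was reasoning.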