PregoPal / modal_deploy

Commit History

fix: remove --voxcpm2-base-lm/acoustic args (not valid for llama-server CLI)
6230c98

J.B-Lin committed on

test_omni: add stderr capture on startup failure
e770d07

J.B-Lin committed on

fix: rewrite test_omni to use llama-server built-in omni endpoints
d67ae6b

J.B-Lin committed on

refactor: deprecate llama-omni-server, use llama-server built-in omni endpoints
6451051

J.B-Lin committed on

fix: set n_parallel=1, n_ctx=4096 before omni_init to prevent n_seq_max > 256
e0c2f3a

J.B-Lin committed on

test_omni: add stderr capture for omni-server when init fails
4429886

J.B-Lin committed on

fix: params.model is common_params_model struct, use .path
a94fb90

J.B-Lin committed on

tmp: health check / omni test scripts for debugging omni_init 500
c86f455

J.B-Lin committed on

fix: set params.model/vpm/apm/tts in server-omni.cpp omni_init handler so model files are found
a850bb4

J.B-Lin committed on

modal/omni: add llama-omni-server for full-duplex voice API
fedf97a

J.B-Lin committed on

modal/omni: text inference verified (Chinese OK, English needs fix)
2400b46

J.B-Lin committed on

modal/omni: llama-server inference verified on T4
0e9e62e

J.B-Lin committed on

chore: add build_llama_server.sh (local cross-compile script, unused)
12a8a38

J.B-Lin committed on

deploy_omni: llama.cpp-omni compile succeeds on Modal T4
1a07461

J.B-Lin committed on

Manual backup by user
0d9e374

J.B-Lin committed on

Manually reorganize files and back up deploy_backup.py
68d23d2

J.B-Lin committed on

core: update ModelLoader to connect to the remote Modal API + sync README
8193cd7

J.B-Lin committed on

modal: downgrade GPU A100 -> T4 to save cost
2d207a7

J.B-Lin committed on

deploy: successfully deploy MiniCPM-o-4_5 to Modal (precompiled llama-cpp-python, A100)
c0f7bc8

J.B-Lin committed on

docs: pre-refactor backup - before tech log integration
d14aeea

J.B-Lin committed on

fix(deploy): use precompiled CUDA wheel instead of compiling from source
aa40303

J.B-Lin committed on

Refactor Modal deployment to build with CUDA, fix streaming, gitignore cookbook_ref
fb89443

J.B-Lin committed on

[modal] Fix: mmproj single-file, health endpoint, client API consistency
f10673a

J.B-Lin committed on

fix(deploy.py): fix 4 issues - MODEL_DIR shadowing, hardcoded find_mmproj_files, MainModel argument passing, remove --multimodal flag
c8e1c13

J.B-Lin committed on

fix: CUDA runtime LD_LIBRARY_PATH + CUDA_HOME in Modal deploy; add inference test script
ea75111

J.B-Lin committed on

feat(modal): deploy MiniCPM-o 4.5 to Modal (llama.cpp + FastAPI)
09159ca

J.B-Lin committed on

feat: llama.cpp deploys MiniCPM-o 4.5 on Modal - technical report and client API
ff5d4ed

J.B-Lin committed on