PregoPal / modal_deploy / deploy.py

Commit History

modal: downgrade GPU from A100 to T4 to save cost
2d207a7

J.B-Lin committed on

deploy: successfully deploy MiniCPM-o-4_5 to Modal (precompiled llama-cpp-python, A100)
c0f7bc8

J.B-Lin committed on

docs: pre-refactor backup - before tech log integration
d14aeea

J.B-Lin committed on

fix(deploy): use a precompiled CUDA wheel instead of compiling from source
aa40303

J.B-Lin committed on

Refactor Modal deployment to build with CUDA, fix streaming, gitignore cookbook_ref
fb89443

J.B-Lin committed on

[modal] Fix: mmproj single-file, health endpoint, client API consistency
f10673a

J.B-Lin committed on

fix(deploy.py): fix 4 issues - MODEL_DIR shadowing, hardcoded find_mmproj_files, MainModel argument passing, remove --multimodal flag
c8e1c13

J.B-Lin committed on

fix: CUDA runtime LD_LIBRARY_PATH + CUDA_HOME in Modal deploy; add inference test script
ea75111

J.B-Lin committed on

feat(modal): deploy MiniCPM-o 4.5 to Modal (llama.cpp + FastAPI)
09159ca

J.B-Lin committed on

feat: llama.cpp deploys MiniCPM-o 4.5 on Modal - technical report and client API
ff5d4ed

J.B-Lin committed on