PregoPal / modal_deploy

Commit History

fix: remove --voxcpm2-base-lm/acoustic args (not valid for llama-server CLI)

6230c98

J.B-Lin committed 23 days ago

test_omni: add stderr capture on startup failure

e770d07

J.B-Lin committed 23 days ago

fix: rewrite test_omni to use llama-server built-in omni endpoints

d67ae6b

J.B-Lin committed 23 days ago

refactor: deprecate llama-omni-server, use llama-server built-in omni endpoints

6451051

J.B-Lin committed 23 days ago

fix: set n_parallel=1, n_ctx=4096 before omni_init to prevent n_seq_max > 256

e0c2f3a

J.B-Lin committed 23 days ago

test_omni: add stderr capture for omni-server when init fails

4429886

J.B-Lin committed 23 days ago

fix: params.model is common_params_model struct, use .path

a94fb90

J.B-Lin committed 23 days ago

tmp: health check / omni test scripts for debugging omni_init 500

c86f455

J.B-Lin committed 23 days ago

fix: set params.model/vpm/apm/tts in server-omni.cpp omni_init handler so model files are found

a850bb4

J.B-Lin committed 23 days ago

modal/omni: add llama-omni-server for full-duplex voice API

fedf97a

J.B-Lin committed 24 days ago

modal/omni: text inference verified (Chinese OK, English needs fix)

2400b46

J.B-Lin committed 24 days ago

modal/omni: llama-server inference verified on T4

0e9e62e

J.B-Lin committed 24 days ago

chore: add build_llama_server.sh (local cross-compile script, unused)

12a8a38

J.B-Lin committed 24 days ago

deploy_omni: llama.cpp-omni compile succeeds on Modal T4

1a07461

J.B-Lin committed 24 days ago

Manual backup by user

0d9e374

J.B-Lin committed 25 days ago

User manually reorganized files and backed up deploy_backup.py

68d23d2

J.B-Lin committed 25 days ago

core: update ModelLoader to connect to the remote Modal API + sync README

8193cd7

J.B-Lin committed 25 days ago

modal: downgrade GPU A100 -> T4 to save cost

2d207a7

J.B-Lin committed 25 days ago

deploy: successfully deployed MiniCPM-o-4_5 to Modal (prebuilt llama-cpp-python, A100)

c0f7bc8

J.B-Lin committed 25 days ago

docs: pre-refactor backup - before tech log integration

d14aeea

J.B-Lin committed 26 days ago

fix(deploy): use prebuilt CUDA wheel instead of compiling from source

aa40303

J.B-Lin committed 26 days ago

Refactor Modal deployment to compile with CUDA, fix streaming, gitignore cookbook_ref

fb89443

J.B-Lin committed 26 days ago

[modal] Fix: mmproj single-file, health endpoint, client API consistency

f10673a

J.B-Lin committed 26 days ago

fix(deploy.py): fix 4 issues - MODEL_DIR shadowing, hard-coded find_mmproj_files, MainModel argument passing, remove --multimodal flag

c8e1c13

J.B-Lin committed 26 days ago

fix: CUDA runtime LD_LIBRARY_PATH + CUDA_HOME in Modal deploy; add inference test script

ea75111

J.B-Lin committed 26 days ago

feat(modal): deploy MiniCPM-o 4.5 to Modal (llama.cpp + FastAPI)

09159ca

J.B-Lin committed 26 days ago

feat: llama.cpp deploys MiniCPM-o 4.5 on Modal - technical report and client API

ff5d4ed

J.B-Lin committed 26 days ago
