|
|
--- |
|
|
license: apache-2.0 |
|
|
tags: |
|
|
- image-text-to-text
- audio-text-to-text
- text-to-text
- llama_cpp
- any-to-any
- multimodal-ai
- video-text-to-text
- llama-model-switcher
|
|
--- |
|
|
# llama_cpp_WebUI

by nzgnzg73
|
|
|
|
|
GitHub: https://github.com/nzgnzg73/llama_cpp_WebUI
|
|
|
|
|
## Want to talk or ask something? |
|
|
Just click the YouTube link below! You'll find my email address there and can message me easily.

YouTube channel: @nzg73
https://youtube.com/@NZG73
|
|
|
|
|
|
|
|
## Contact Email

Email: nzgnzg73@gmail.com
|
|
|
|
|
|
|
|
## llama.cpp (CPU/GPU builds)
|
|
Old llama.cpp version:
|
|
llama-b7200-bin-win-cpu-x64.zip |
|
|
|
|
|
|
|
|
New (updated) llama.cpp version:
|
|
|
|
|
llama-b7541-bin-win-cpu-x64 |
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
## Llama Model Switcher |
|
|
|
|
|
Run model_switcher.py from the command line (CMD). Install its dependencies first:

pip install flask psutil GPUtil
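The switcher script itself is not reproduced here, but as a rough idea of how such a tool can work, here is a minimal, purely illustrative Flask sketch. Endpoint names, folder layout, and port numbers below are assumptions, not the actual model_switcher.py:

```python
# model_switcher_sketch.py -- illustrative only, NOT the repository's actual model_switcher.py.
# Assumptions: llama-server is on PATH and GGUF models live under ./models.
import subprocess
from pathlib import Path

import GPUtil
import psutil
from flask import Flask, jsonify, request

app = Flask(__name__)
MODELS_DIR = Path("models")
server_proc = None  # handle to the llama-server process currently running, if any


@app.get("/models")
def list_models():
    # List every .gguf file found under the models directory.
    return jsonify(sorted(str(p) for p in MODELS_DIR.rglob("*.gguf")))


@app.get("/stats")
def stats():
    # Report RAM usage via psutil and GPU load via GPUtil (the listed dependencies).
    gpus = [{"name": g.name, "load": g.load, "vram_used_mb": g.memoryUsed}
            for g in GPUtil.getGPUs()]
    return jsonify({"ram_percent": psutil.virtual_memory().percent, "gpus": gpus})


@app.post("/switch")
def switch_model():
    # Stop the current llama-server (if any) and start a new one with the requested model.
    global server_proc
    model = request.json["model"]
    if server_proc is not None:
        server_proc.terminate()
        server_proc.wait()
    server_proc = subprocess.Popen(
        ["llama-server", "-m", model, "--host", "127.0.0.1", "--port", "8080"])
    return jsonify({"running": model, "pid": server_proc.pid})


if __name__ == "__main__":
    app.run(port=5000)
```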
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
## Image-Text-to-Text Models |
|
|
|
|
|
## Gemma-3 |
|
|
|
|
|
Requirements: CPU with 20 GB RAM, or GPU with 4 GB VRAM.
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
1. gemma-3-12b-it-Q4_K_S.gguf |
|
|
2. mmproj-model-f16-12B.gguf |
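A hypothetical llama-server invocation for this pair (the folder layout, context size, and port below are assumptions, following the same pattern as the run.bat commands later on this page):

llama-server -m ".\models\gemma\gemma-3-12b-it-Q4_K_S.gguf" --mmproj ".\models\gemma\mmproj-model-f16-12B.gguf" --ctx-size 8192 --host 127.0.0.1 --port 8080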
|
|
|
|
|
|
|
|
## Text-to-Text Models
|
|
## GPT-OSS-20B
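GPT-OSS-20B is a text-only model, so no --mmproj file is needed. A hypothetical llama-server invocation (the GGUF file path and port are placeholders, not files shipped with this project):

llama-server -m ".\models\gpt-oss\<gpt-oss-20b GGUF file>" --ctx-size 8192 --host 127.0.0.1 --port 8081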
|
|
|
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Qwen3-VL
|
|
|
|
|
|
|
|
Requirements: CPU with 25 GB RAM, or GPU with 4 GB VRAM.
|
|
|
|
|
|
|
|
1. Qwen3-VL-2B-Instruct-Q8_0.gguf |
|
|
2. mmproj-Qwen3-VL-2B-Instruct-Q8_0.gguf |
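A hypothetical llama-server invocation for this pair (folder layout, context size, and port are assumptions):

llama-server -m ".\models\qwen\Qwen3-VL-2B-Instruct-Q8_0.gguf" --mmproj ".\models\qwen\mmproj-Qwen3-VL-2B-Instruct-Q8_0.gguf" --ctx-size 8192 --host 127.0.0.1 --port 8082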
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
## Qwen2.5-Omni |
|
|
|
|
|
|
|
|
Requirements: CPU with 40 GB RAM, or GPU with 8 GB VRAM.
|
|
|
|
|
1. Qwen2.5-Omni-7B-BF16.gguf |
|
|
2. mmproj-F16.gguf (2 GB)
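A hypothetical llama-server invocation for this pair (folder layout, GPU layer count, context size, and port are assumptions):

llama-server -m ".\models\qwen\Qwen2.5-Omni-7B-BF16.gguf" --mmproj ".\models\qwen\mmproj-F16.gguf" --n-gpu-layers 15 --ctx-size 8192 --host 127.0.0.1 --port 8084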
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Audio-Text-to-Text |
|
|
|
|
|
|
|
|
## Llama-3.2 |
|
|
|
|
|
|
|
|
Requirements: CPU with 10 GB RAM.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
1. Llama-3.2-1B-Instruct-Q4_K_M.gguf |
|
|
2. Llama-3.2-1B-Instruct-Q8_0.gguf |
|
|
3. mmproj-ultravox-v0_5-llama-3_2-1b-f16.gguf |
|
|
|
|
|
|
|
|
|
|
|
## run.bat |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Local server:
|
|
|
|
|
llama-server.exe --n-gpu-layers 2 --ctx-size 111192 -m ".\models\mistralai\mistralai_Voxtral-Mini-3B-2507-Q8_0.gguf" --mmproj ".\models\mistralai\mmproj-mistralai_Voxtral-Mini-3B-2507-bf16.gguf" --host 0.0.0.0 --port 8005
|
|
|
|
|
|
|
|
Public URL:
|
|
|
|
|
llama-server --n-gpu-layers 15 --ctx-size 8192 -m models/ollma/Llama-3.2-1B-Instruct-Q8_0.gguf --mmproj models/ollma/mmproj-ultravox-v0_5-llama-3_2-1b-f16.gguf --host 127.0.0.1 --port 8083