Keep Q2_K/Q3_K_M gguf only
- saiga-7b-Q3_K_L.gguf +0 -3
- saiga-7b-Q3_K_S.gguf +0 -3
- saiga-7b-Q4_0.gguf +0 -3
- saiga-7b-Q4_K_M.gguf +0 -3
- saiga-7b-Q4_K_S.gguf +0 -3
- saiga-7b-Q5_0.gguf +0 -3
- saiga-7b-Q5_K_M.gguf +0 -3
- saiga-7b-Q5_K_S.gguf +0 -3
- saiga-7b-Q6_K.gguf +0 -3
- saiga-7b-Q8_0.gguf +0 -3
saiga-7b-Q3_K_L.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:eb1f72eb934b78b31ffc5665ad4e6b53bb86d7fd4244e903d9763e5c69032c67
-size 3822025344
saiga-7b-Q3_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:1441e4c2943bb3f8e43b93f5cef96306a3d55b5d733e65b8853a74942df02bfe
-size 3164568192
saiga-7b-Q4_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e44ae023b3d3232eab656c0d15011a52319c98be61190f5c8e138ea6e3f68c34
-size 4108917376
saiga-7b-Q4_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:dfb9a948e0dabf047f362b9bb08464b7801758743bdbd7cd74f39d44f85db3e4
-size 4368439936
saiga-7b-Q4_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:ca39f17754877ecddf72330e2b7edd3e1ddea06b86373405743e8f66f7d29656
-size 4140374656
saiga-7b-Q5_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f4af8f0e83b0c9ae1c5ab3593c86769fc2e62b15efbddc72bdc533055e98a82a
-size 4997716608
saiga-7b-Q5_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7c747bfe566676d00cb891feed9e1c2038c14d20383a8cd75f4e29b6325cde7b
-size 5131410048
saiga-7b-Q5_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b989b280de9458c433e8800bad9d540d464d9a4d0ad44709ddf80be23ecfa6a5
-size 4997716608
saiga-7b-Q6_K.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7794349dc70ca6c1b01d920315f281a95d6e3d63a51d46512c6cb13692b38aa4
-size 5942065792
saiga-7b-Q8_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:1612a76e42c6fa8213fe5bb770a06be5e97991d58b217504bca2b1f1bd5eb836
-size 7695858304
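Each deleted file in this commit is a three-line Git LFS pointer (version, oid, size), not the model weights themselves. As a rough illustration, the sketch below parses one such pointer and totals the sizes listed in the diff above. The helper name `parse_lfs_pointer` and the `removed_sizes` table are hypothetical, introduced only for this example; the values are copied from the removed pointers.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer removed for saiga-7b-Q8_0.gguf, copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1612a76e42c6fa8213fe5bb770a06be5e97991d58b217504bca2b1f1bd5eb836
size 7695858304
"""

info = parse_lfs_pointer(pointer)
algo, _, digest = info["oid"].partition(":")
size_bytes = int(info["size"])

# Sizes (in bytes) of all ten pointers removed by this commit.
removed_sizes = {
    "saiga-7b-Q3_K_L.gguf": 3822025344,
    "saiga-7b-Q3_K_S.gguf": 3164568192,
    "saiga-7b-Q4_0.gguf": 4108917376,
    "saiga-7b-Q4_K_M.gguf": 4368439936,
    "saiga-7b-Q4_K_S.gguf": 4140374656,
    "saiga-7b-Q5_0.gguf": 4997716608,
    "saiga-7b-Q5_K_M.gguf": 5131410048,
    "saiga-7b-Q5_K_S.gguf": 4997716608,
    "saiga-7b-Q6_K.gguf": 5942065792,
    "saiga-7b-Q8_0.gguf": 7695858304,
}
total_bytes = sum(removed_sizes.values())

print(f"{algo} digest {digest[:12]}..., {size_bytes} bytes")
print(f"total removed: {total_bytes} bytes across {len(removed_sizes)} files")
```

The total counts only the sizes recorded in the pointers; whether the underlying LFS objects are actually freed on the server depends on the host's storage policy.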