Zeta 2 GGUF
This is a direct GGUF conversion of zed-industries/zeta-2.
Quantizations prefixed with I in this repo do not use an "importance matrix", so the quality of those models might be limited.
Zeta 2 is a code edit prediction (also known as next-edit suggestion) model finetuned from ByteDance-Seed/Seed-Coder-8B-Base.
Given the code context, the edit history, and an editable region around the cursor, it predicts the rewritten content for that region.
Zed Editor + Llama.cpp
This guide assumes you will use a GPU with enough VRAM to load the model in full.
I wasn’t able to get significantly better predictions from this model compared with the previous Zeta model, so quality may vary.
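As a rough way to judge whether a quant fits in VRAM, the weight size can be approximated from the parameter count and the bits per weight. This is only a back-of-the-envelope sketch: the 8e9 parameter count is an assumption taken from the "8B" in the base model name, the bit widths are nominal, and the KV cache and runtime overhead are not included.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the model weights in GiB (weights only, no KV cache)."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # Assumption: about 8e9 parameters, inferred from "8B" in the base model name.
    n_params = 8e9
    # Nominal bit widths; real GGUF quants typically use somewhat more bits per weight.
    for bits in (4, 8, 16):
        print(f"{bits}-bit: ~{estimate_weight_gib(n_params, bits):.1f} GiB")
```

The actual file sizes listed in the repository are the better guide; this estimate only helps when picking a quant to try first.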
- Install llama.cpp (preferably with GPU acceleration).
- Download the model manually (optionally, you can use the `-hf` option in the later commands to load the model from Hugging Face).
- Run the model to check that it works: `llama-cli -m model.gguf`
- Start the llama.cpp server: `llama-server -m zeta2-Q4_K_M.gguf --port 13377 --ctx-size 4096 --jinja -ngl 100 --host 0.0.0.0 --api-key "APIKEY"`
| Argument | Explanation |
|---|---|
| -m zeta2-Q4_K_M.gguf | Loads the model from file |
| --port 13377 | Makes the server listen on port 13377 instead of the default 8080. |
| --ctx-size 4096 | Sets the context window size to 4096 tokens. |
| --jinja | Uses the Jinja template embedded in the model instead of the default one. |
| -ngl 100 | Offloads up to 100 layers to the GPU, if supported. |
| --host 0.0.0.0 | Binds the server to all network interfaces, so it can accept connections from other machines on your network, not just localhost. |
| --api-key "APIKEY" | Sets the API key that clients must send. Zed requires some key to be set. |
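If you start the server from a script, the arguments in the table can be assembled programmatically. The sketch below only mirrors the command shown above; the function name `build_server_command` and its defaults are illustrative, not part of llama.cpp.

```python
import shlex


def build_server_command(model_path, port=13377, ctx_size=4096,
                         gpu_layers=100, host="0.0.0.0", api_key="APIKEY"):
    """Return the llama-server argument list used in this guide."""
    return [
        "llama-server",
        "-m", model_path,            # model file to load
        "--port", str(port),         # listen port (default in llama.cpp is 8080)
        "--ctx-size", str(ctx_size), # context window in tokens
        "--jinja",                   # use the template embedded in the model
        "-ngl", str(gpu_layers),     # layers to offload to the GPU
        "--host", host,              # 0.0.0.0 = all network interfaces
        "--api-key", api_key,        # key that clients such as Zed must send
    ]


if __name__ == "__main__":
    cmd = build_server_command("zeta2-Q4_K_M.gguf")
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems if the model path contains spaces.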
- Open Zed Editor Settings (GUI) and choose AI. Under Edit Predictions, click Configure.
- Scroll down to the OpenAI compatible API section.
- Set the API key to `APIKEY` and press Enter. This step is not optional, even if you only use localhost (at the time of writing this guide).
- Set the API URL to `http://localhost:13377/v1/completions` and press Enter. The port must match the `--port` value passed to llama-server.
- Set the model to `zeta2-Q4_K_M.gguf` and press Enter.
- (Optional) Set max output tokens to 256.
- Scroll up.
- Set Provider to OpenAI compatible API.
- Restart Zed.
- Completions should work now (quality may vary).
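If completions do not appear in Zed, it can help to check the server directly first. The following is a minimal sketch that sends one request to the OpenAI-compatible completions endpoint, assuming the server was started with the command above (port 13377, key `APIKEY`). The helper name `build_completion_request` and the test prompt are made up for illustration, and the prompt is not the format Zed itself sends.

```python
import json
import urllib.request


def build_completion_request(url, api_key, model, prompt, max_tokens=32):
    """Build (but do not send) a POST request for an OpenAI-style completion."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # same key as --api-key
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_completion_request(
        "http://localhost:13377/v1/completions",  # must match --port
        "APIKEY",
        "zeta2-Q4_K_M.gguf",
        "def add(a, b):\n    return",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    # OpenAI-style completion responses carry the text in choices[0].text.
    print(body["choices"][0]["text"])
```

A connection error here points at the host or port, while an HTTP 401 points at a mismatched API key.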
Zed Editor + Ollama mini-guide
Ollama support does not seem to be the best at the moment; I recommend using llama.cpp.
Pull the model (this example uses the Q4_K_M quant; you can use a different quant if you want):
ollama pull hf.co/bluevoid-pl/zeta2-GUFF:Q4_K_M
Configure Zed Editor
- Open Settings (GUI) and choose AI. Under Edit Predictions, click Configure.
- Scroll down.
- Confirm the host URL is `http://localhost:11434` (if you changed the default, modify this accordingly).
- Set Model to `bluevoid-pl/zeta2-GUFF:Q4_K_M` (the same quant as before).
- Scroll to the top.
- Set Provider to Ollama.
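To confirm that the pull succeeded before configuring Zed, you can list the models Ollama has stored locally through its `/api/tags` endpoint. This is a sketch under two assumptions: Ollama is listening on the default `http://localhost:11434`, and the stored name ends with the repository and quant tag. The helper `find_model` is hypothetical.

```python
import json
import urllib.request


def find_model(tags: dict, wanted: str) -> list:
    """Return local model names from an /api/tags response that end with `wanted`."""
    names = [m.get("name", "") for m in tags.get("models", [])]
    return [name for name in names if name.endswith(wanted)]


if __name__ == "__main__":
    # Assumption: default Ollama host and port.
    with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=10) as resp:
        tags = json.load(resp)
    matches = find_model(tags, "bluevoid-pl/zeta2-GUFF:Q4_K_M")
    print(matches if matches else "Model not found; run the ollama pull command first.")
```

The printed name is the one Ollama knows the model by, which is useful to compare against the Model field in Zed.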
If you have any proposals or recommendations, leave them in the community discussions.
Info
- Developed by: Zed Industries
- License: Apache-2.0
- Fine-tuned from: ByteDance-Seed/Seed-Coder-8B-Base
- Model version: 0225-s3-seed