Instructions to use W4D/YugoGPT-7B-Instruct-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use W4D/YugoGPT-7B-Instruct-GGUF with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="W4D/YugoGPT-7B-Instruct-GGUF", filename="YugoGPT-7B-Instruct-F16.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use W4D/YugoGPT-7B-Instruct-GGUF with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
Use Docker
docker model run hf.co/W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use W4D/YugoGPT-7B-Instruct-GGUF with Ollama:
ollama run hf.co/W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
- Unsloth Studio new
How to use W4D/YugoGPT-7B-Instruct-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for W4D/YugoGPT-7B-Instruct-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for W4D/YugoGPT-7B-Instruct-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for W4D/YugoGPT-7B-Instruct-GGUF to start chatting
- Docker Model Runner
How to use W4D/YugoGPT-7B-Instruct-GGUF with Docker Model Runner:
docker model run hf.co/W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
- Lemonade
How to use W4D/YugoGPT-7B-Instruct-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull W4D/YugoGPT-7B-Instruct-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.YugoGPT-7B-Instruct-GGUF-Q4_K_M
List all available models
lemonade list
YugoGPT Instruct
YugoGPT Instruct is a fine-tuned version of the YugoGPT base model designed specifically for translation tasks involving Serbian, Croatian, and Bosnian languages. Unlike the base model, this instruct model is optimized for following user instructions, offering improved performance in instruction-based interactions.
Overview
YugoGPT Instruct builds upon the powerful capabilities of the YugoGPT base model, fine-tuning it to enhance its usability in structured and directive tasks. This model is ideal for translation workflows where accuracy and context preservation are critical.
Features
- Specialized for BCS Languages: Tailored for Serbian, Croatian, and Bosnian language translations.
- Instruction Following: Fine-tuned to better adhere to user-provided instructions.
- Flexible Deployment: Compatible with various quantization formats for different computational environments.
Quantization Formats
A variety of quantization formats are available to suit diverse performance and resource requirements. Below is the table of quantization options:
| Filename | Quant Type | Description |
|---|---|---|
YugoGPT-7B-Instruct-F16 |
F16 | Full F16 precision, maximum quality. |
YugoGPT-7B-Instruct-Q8_0 |
Q8_0 | Extremely high quality. |
YugoGPT-7B-Instruct-Q6_K |
Q6_K | Very high quality, near perfect, recommended. |
YugoGPT-7B-Instruct-Q5_K_M |
Q5_K_M | High quality, recommended. |
YugoGPT-7B-Instruct-Q5_K_S |
Q5_K_S | High quality with optimal trade-offs. |
YugoGPT-7B-Instruct-Q4_K_M |
Q4_K_M | Good quality, optimized for speed. |
YugoGPT-7B-Instruct-Q4_K_S |
Q4_K_S | Slightly lower quality with more savings. |
YugoGPT-7B-Instruct-Q3_K_L |
Q3_K_L | Lower quality, good for low RAM systems. |
YugoGPT-7B-Instruct-Q3_K_M |
Q3_K_M | Low quality, optimized for size. |
YugoGPT-7B-Instruct-Q3_K_S |
Q3_K_S | Low quality, not recommended. |
Usage
For usage with Ollama, you can initialize the model using the provided modelfile in the repository. Follow Ollama’s setup instructions to get started.
Replace {__FILE_LOCATION__} with the file name of the quant you want to use when creating the model using Ollama CLI.
Licensing
This model is released under the Apache 2.0 License, the same as the YugoGPT base repository.
Credits
- Base Model: YugoGPT by Aleksa Gordić
- Fine-Tuning Framework: Unsloth
- Downloads last month
- 74
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
Model tree for W4D/YugoGPT-7B-Instruct-GGUF
Base model
gordicaleksa/YugoGPT