Instructions for using MoxoffSrL/AzzurroQuantized with libraries, inference servers, and local apps.

Libraries

How to use MoxoffSrL/AzzurroQuantized with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MoxoffSrL/AzzurroQuantized")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MoxoffSrL/AzzurroQuantized")
model = AutoModelForCausalLM.from_pretrained("MoxoffSrL/AzzurroQuantized")
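
When `pipe(messages)` is called with a list of chat messages, recent Transformers releases return the whole conversation rather than a bare string. The sketch below shows one way to pull out the assistant's turn. The helper name `last_assistant_reply` and the sample data are illustrative assumptions, not part of the library or real model output.

```python
from typing import Any


def last_assistant_reply(outputs: list[dict[str, Any]]) -> str:
    """Return the generated text from a text-generation pipeline result."""
    conversation = outputs[0]["generated_text"]
    # Plain-string prompts yield a plain string.
    if isinstance(conversation, str):
        return conversation
    # Chat input yields a list of role/content dicts; the newly generated
    # turn is the last entry.
    return conversation[-1]["content"]


# Hand-written sample that mirrors the expected shape (not real model output).
sample = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "(model reply)"},
        ]
    }
]
print(last_assistant_reply(sample))
```

With a loaded pipeline, the call would look like `last_assistant_reply(pipe(messages))`.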

llama-cpp-python

How to use MoxoffSrL/AzzurroQuantized with llama-cpp-python:

# !pip install llama-cpp-python huggingface-hub
# (Llama.from_pretrained downloads from the Hub and needs huggingface-hub)

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="MoxoffSrL/AzzurroQuantized",
	filename="Azzurro-ggml-Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
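
The repository's commit history names two GGUF files, one per quantization (`Q4_K_M` and `Q8_0`). A minimal sketch for choosing the `filename` argument by quantization tag follows. The helper `gguf_filename` is hypothetical, and it assumes the files still carry the names given in the rename commits.

```python
REPO_ID = "MoxoffSrL/AzzurroQuantized"

# Quantization tags taken from the file names in the rename commits.
KNOWN_QUANTS = ("Q4_K_M", "Q8_0")


def gguf_filename(quant: str) -> str:
    """Build the GGUF file name for a known quantization tag."""
    if quant not in KNOWN_QUANTS:
        raise ValueError(f"unknown quantization {quant!r}; expected one of {KNOWN_QUANTS}")
    return f"Azzurro-ggml-{quant}.gguf"


print(gguf_filename("Q8_0"))

# Usage (downloads the file, so it is left commented out here):
# from llama_cpp import Llama
# llm = Llama.from_pretrained(repo_id=REPO_ID, filename=gguf_filename("Q8_0"))
```

The larger `Q8_0` file trades more memory for less quantization loss; `Q4_K_M` is the smaller of the two.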

Local Apps

llama.cpp

How to use MoxoffSrL/AzzurroQuantized with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MoxoffSrL/AzzurroQuantized:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf MoxoffSrL/AzzurroQuantized:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MoxoffSrL/AzzurroQuantized:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf MoxoffSrL/AzzurroQuantized:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MoxoffSrL/AzzurroQuantized:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf MoxoffSrL/AzzurroQuantized:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MoxoffSrL/AzzurroQuantized:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf MoxoffSrL/AzzurroQuantized:Q4_K_M

Use Docker

docker model run hf.co/MoxoffSrL/AzzurroQuantized:Q4_K_M

vLLM

How to use MoxoffSrL/AzzurroQuantized with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "MoxoffSrL/AzzurroQuantized"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MoxoffSrL/AzzurroQuantized",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
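
The same request can be sent from Python using only the standard library. This is a sketch rather than an official client: the helper names `build_chat_request` and `send` are illustrative, and the default base URL assumes the server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = "MoxoffSrL/AzzurroQuantized",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body; needs a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_chat_request("What is the capital of France?")
print(request.get_method(), request.full_url)

# With the server running:
# print(send(request)["choices"][0]["message"]["content"])
```

Passing a different `base_url` (for example `http://localhost:30000`) points the same helper at another OpenAI-compatible server.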

Use Docker

docker model run hf.co/MoxoffSrL/AzzurroQuantized:Q4_K_M

SGLang

How to use MoxoffSrL/AzzurroQuantized with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "MoxoffSrL/AzzurroQuantized" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MoxoffSrL/AzzurroQuantized",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "MoxoffSrL/AzzurroQuantized" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MoxoffSrL/AzzurroQuantized",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
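
Because the endpoint is described as OpenAI-compatible, the JSON it returns should follow the chat-completion response layout. The sketch below parses such a body and extracts the reply text plus the token count, if one is reported. The function name `parse_chat_completion` and the sample body are assumptions for illustration, not captured server output.

```python
import json
from typing import Optional, Tuple


def parse_chat_completion(body: str) -> Tuple[str, Optional[int]]:
    """Extract the assistant reply and total token count from a response body."""
    data = json.loads(body)
    reply = data["choices"][0]["message"]["content"]
    # "usage" may be absent or null, so fall back to an empty dict.
    total_tokens = (data.get("usage") or {}).get("total_tokens")
    return reply, total_tokens


# Hand-written body in the OpenAI chat-completion layout (not real output).
sample_body = json.dumps(
    {
        "object": "chat.completion",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "(model reply)"},
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 12, "completion_tokens": 4, "total_tokens": 16},
    }
)
reply, total_tokens = parse_chat_completion(sample_body)
print(reply, total_tokens)
```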

Ollama

How to use MoxoffSrL/AzzurroQuantized with Ollama:

```
ollama run hf.co/MoxoffSrL/AzzurroQuantized:Q4_K_M
```

Unsloth Studio

How to use MoxoffSrL/AzzurroQuantized with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoxoffSrL/AzzurroQuantized to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoxoffSrL/AzzurroQuantized to start chatting

Use Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MoxoffSrL/AzzurroQuantized to start chatting

Docker Model Runner

How to use MoxoffSrL/AzzurroQuantized with Docker Model Runner:

```
docker model run hf.co/MoxoffSrL/AzzurroQuantized:Q4_K_M
```

Lemonade

How to use MoxoffSrL/AzzurroQuantized with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull MoxoffSrL/AzzurroQuantized:Q4_K_M

Run and chat with the model

lemonade run user.AzzurroQuantized-Q4_K_M

List all available models

lemonade list

AzzurroQuantized

Commit History

Update README.md (400e0f5, verified), marcodambra committed on Apr 9, 2024
Update README.md (cb0600b, verified), JacopoAbate committed on Apr 9, 2024
Update README.md (e11ca4e, verified), marcodambra committed on Apr 9, 2024
Update README.md (9e61e55, verified), JacopoAbate committed on Apr 9, 2024
Update README.md (a66273a, verified), JacopoAbate committed on Apr 8, 2024
Update README.md (a2ee4eb, verified), JacopoAbate committed on Apr 8, 2024
Rename xxxx-ggml-Q8_0.gguf to Azzurro-ggml-Q8_0.gguf (e6a6bb7, verified), JacopoAbate committed on Apr 8, 2024
Rename xxxx-ggml-Q4_K_M.gguf to Azzurro-ggml-Q4_K_M.gguf (71fab18, verified), JacopoAbate committed on Apr 8, 2024
Update README.md (b118fbe, verified), JacopoAbate committed on Apr 8, 2024
Update README.md (99c461a, verified), marcodambra committed on Apr 8, 2024
Update README.md (3db21aa, verified), marcodambra committed on Apr 8, 2024
Update README.md (680b9b3, verified), marcodambra committed on Apr 8, 2024
Update README.md (adc93ab, verified), marcodambra committed on Apr 8, 2024
Update README.md (fb2fbd7, verified), Moxoff committed on Apr 8, 2024
Update README.md (9720d1b, verified), Moxoff committed on Apr 8, 2024
Update README.md (1f09980, verified), JacopoAbate committed on Apr 8, 2024
Update README.md (f39ddc8, verified), JacopoAbate committed on Apr 5, 2024
Update README.md (b90a335, verified), marcodambra committed on Apr 5, 2024
Upload 2 files (2b2a2ed, verified), JacopoAbate committed on Apr 4, 2024
Delete xxxx-ggml-Q8_0.gguf (dfbac42, verified), JacopoAbate committed on Apr 4, 2024
Delete xxxx-ggml-Q4_K_M.gguf (d0452d2, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (7cec443, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (ba45ffb, verified), marcodambra committed on Apr 4, 2024
Update README.md (318249c, verified), marcodambra committed on Apr 4, 2024
Update README.md (fa7a860, verified), marcodambra committed on Apr 4, 2024
Update README.md (44a1115, verified), marcodambra committed on Apr 4, 2024
Update README.md (a35d929, verified), marcodambra committed on Apr 4, 2024
Update README.md (896c3eb, verified), marcodambra committed on Apr 4, 2024
Update README.md (4615aa9, verified), marcodambra committed on Apr 4, 2024
Update README.md (402f76b, verified), marcodambra committed on Apr 4, 2024
Update README.md (5c31029, verified), marcodambra committed on Apr 4, 2024
Update README.md (dc47378, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (2992bd0, verified), marcodambra committed on Apr 4, 2024
Update README.md (af83552, verified), JacopoAbate committed on Apr 4, 2024
Upload config.json (56d8852, verified), JacopoAbate committed on Apr 4, 2024
Upload 2 files (a3bf823, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (280a9b3, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (e760493, verified), JacopoAbate committed on Apr 4, 2024
Update README.md (4ec90e6, verified), JacopoAbate committed on Apr 4, 2024
initial commit (3b9cda8, verified), JacopoAbate committed on Apr 4, 2024
