Use with the llama-cpp-python library
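# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download one quantization from the Hub and load it.
# filename must match a file in the repo; Q4_K_M is the smallest one listed below.
llm = Llama.from_pretrained(
    repo_id="squ11z1/Hypnos-Q1-GGUF",
    filename="Hypnos-Q1.Q4_K_M.gguf",
)
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ]
)
# The reply follows the OpenAI chat-completion layout.
print(response["choices"][0]["message"]["content"])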
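
The call above returns the whole reply at once. For interactive use it can help to print tokens as they arrive. The following is a minimal sketch of that, assuming the `stream=True` option of `create_chat_completion` and its OpenAI-style chunk layout. The helper names `iter_text` and `stream_demo` are illustrative and not part of the library, and the chosen filename is just one of the files listed in this card.

```python
def iter_text(chunks):
    """Yield text pieces from streamed chat-completion chunks.

    Each chunk is assumed to follow the OpenAI-style layout, where
    chunk["choices"][0]["delta"] may carry a "content" string.
    """
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            yield text


def stream_demo():
    # Imported here so iter_text can be used without the library installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="squ11z1/Hypnos-Q1-GGUF",
        filename="Hypnos-Q1.Q4_K_M.gguf",  # any file from the repo should work
    )
    stream = llm.create_chat_completion(
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        stream=True,  # return an iterator of chunks instead of one response
    )
    for piece in iter_text(stream):
        print(piece, end="", flush=True)
    print()
```

Calling `stream_demo()` downloads the file on first use and then prints the reply incrementally. Chunks that carry only a role or an empty delta are skipped by `iter_text`.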

Hypnos-Q1 — GGUF

GGUF quantizations of squ11z1/Hypnos-Q1.

| File | Quant | Size |
|---|---|---|
| Hypnos-Q1.F16.gguf | F16 | 8.4 GB |
| Hypnos-Q1.Q8_0.gguf | Q8_0 | 4.5 GB |
| Hypnos-Q1.Q6_K.gguf | Q6_K | 3.5 GB |
| Hypnos-Q1.Q5_K_M.gguf | Q5_K_M | 3.1 GB |
| Hypnos-Q1.Q4_K_M.gguf | Q4_K_M | 2.7 GB |
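
One way to choose among the files above is to take the largest one that fits the available memory, on the general assumption that a larger quantization stays closer to the original weights. The sketch below encodes the listed sizes and applies that rule. The function name `pick_quant` and the `headroom_gb` allowance are hypothetical; actual memory use also depends on context length and backend, so file size is only a lower bound.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Hypnos-Q1.F16.gguf": 8.4,
    "Hypnos-Q1.Q8_0.gguf": 4.5,
    "Hypnos-Q1.Q6_K.gguf": 3.5,
    "Hypnos-Q1.Q5_K_M.gguf": 3.1,
    "Hypnos-Q1.Q4_K_M.gguf": 2.7,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest file that fits in budget_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    buffers, not a measured value. Returns None if no file fits.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(pick_quant(6.0))
```

The returned name can be passed as the `filename` argument of `Llama.from_pretrained`.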

See the base model card for evaluation results and architecture details.

Format: GGUF
Model size: 4B params
Architecture: qwen35
Model tree for squ11z1/Hypnos-Q1-GGUF
Finetuned: Qwen/Qwen3.5-4B
Quantized: this model