How to use from the
Use from the
llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="QuantPasture/GLM-4.6-GGUF",
	filename="",
)
llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Currently a WIP, I have several quants that will be KLD tested and uploaded here over the next few days. The reference logits for the KLD testing have been provided for reproducing results or performing comparative analysis with other quants.

Downloads last month
49
GGUF
Model size
357B params
Architecture
glm4moe
Hardware compatibility
Log In to add your hardware

3-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support