How to use from the llama-cpp-python library
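# !pip install llama-cpp-python

from llama_cpp import Llama

# Download a GGUF file from the Hugging Face Hub and load it.
llm = Llama.from_pretrained(
    repo_id="akx/Poro-34B-gguf",
    # Left empty in the original: set this to the name (or a glob pattern)
    # of one of the GGUF files in the repository before running.
    filename="",
)

# Generate up to 512 new tokens; echo=True includes the prompt in the result.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)

# `output` is a completion dictionary; the text is under output["choices"][0]["text"].
print(output)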
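
The snippet above leaves `filename` empty, so a GGUF file from the repository has to be chosen first. The following is a minimal sketch of one way to see what is available, assuming the `huggingface_hub` package is installed; the helper names `pick_gguf_files` and `list_repo_gguf_files` are hypothetical and not part of any library, and the sketch does not assume which files the repository actually contains.

```python
from fnmatch import fnmatch


def pick_gguf_files(filenames, pattern="*.gguf"):
    """Return the sorted names in `filenames` matching `pattern`, ignoring case.

    Hypothetical helper for illustration only.
    """
    return sorted(
        name for name in filenames if fnmatch(name.lower(), pattern.lower())
    )


def list_repo_gguf_files(repo_id, pattern="*.gguf"):
    """List the GGUF files in a Hugging Face repository (needs network access).

    Hypothetical helper; it relies on huggingface_hub.list_repo_files.
    """
    # Imported lazily so pick_gguf_files works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return pick_gguf_files(list_repo_files(repo_id), pattern)
```

Calling `list_repo_gguf_files("akx/Poro-34B-gguf")` should return the candidate file names, one of which can then be passed as `filename` to `Llama.from_pretrained`. A narrower pattern can be passed as the second argument to filter by quantization type.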

Poro-34B-gguf

This is a GGUF quantization of the Poro-34B model.

Please refer to that repository's model card for details.

The current revision is a quantization of the 1000B token checkpoint.

The conversion was done with llama.cpp version b2354 (e25fb4b18fcedb9bed6be4585cf842e9a669b28b) on a Google Compute machine generously sponsored by Valohai.

Format: GGUF
Model size: 35B params
Architecture: bloom