Use from the llama-cpp-python library
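# !pip install llama-cpp-python

from llama_cpp import Llama

# This is an embedding model, so load it with embedding=True
# instead of calling it for text completion.
llm = Llama.from_pretrained(
	repo_id="ggml-org/Nomic-Embed-Text-V2-GGUF",
	filename="nomic-embed-text-v2-moe-q8_0.gguf",
	embedding=True,
)
# Nomic embedding models expect a task prefix such as "search_query: "
# or "search_document: " (see the original model card).
output = llm.create_embedding("search_query: What is GGUF?")
# Print the first few values of the embedding.
print(output["data"][0]["embedding"][:8])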
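
Because this is a text-embedding model, a typical next step is to compare embedding vectors. The following is a minimal sketch, assuming the vectors are already available as plain Python lists of floats. The helper names `cosine_similarity` and `rank_documents` are hypothetical, and the sample vectors are made up for illustration rather than produced by this model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors (illustrative only; real embeddings are much longer).
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_documents(query, docs)
print(ranking[0][0])  # index of the document closest to the query
```

In practice, the query and document vectors would be embeddings produced by the model, and the ranking would drive a retrieval or search step.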

ggml-org/Nomic-Embed-Text-V2-GGUF

This model was converted to GGUF format from nomic-ai/nomic-embed-text-v2-moe using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Use with llama.cpp

See https://github.com/ggml-org/llama.cpp/pull/12466#pullrequestreview-2799984859

Format: GGUF
Model size: 0.5B params
Architecture: nomic-bert-moe
Quantization: 8-bit
