Use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="MigsN9/SmolLM2-360M-Instruct-Mem-Cat",
	filename="smollm2-360m-mem-cat-q8_0.gguf",
)
llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
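create_chat_completion returns an OpenAI-style dict; the assistant's reply text sits under choices[0]["message"]["content"]. A minimal sketch of extracting it, using a mocked response dict so it runs without the model:

```python
# Mocked response in the OpenAI-style shape that create_chat_completion returns.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "The capital of France is Paris."}}
    ]
}

def reply_text(response: dict) -> str:
    # Pull the assistant's reply out of the first choice.
    return response["choices"][0]["message"]["content"]

text = reply_text(response)
```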

Hermie Assistant - Memory Router

System prompt:

"\n{"fact":str|null,"retrieve":bool,"tool":bool,"emotion":str}\nfact: durable personal fact, compressed. null if nothing to store."

Output format:

{"fact":str|null,"retrieve":bool,"tool":bool,"emotion":str}
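A minimal sketch of parsing the router's single-line JSON reply into that schema; the helper name and the default values for missing keys are assumptions, not part of the model card:

```python
import json

def parse_router_output(raw: str) -> dict:
    """Parse the router's JSON reply.

    Expected schema: {"fact": str|None, "retrieve": bool, "tool": bool, "emotion": str}
    """
    out = json.loads(raw)
    # Normalize missing keys so downstream code can rely on the full schema.
    return {
        "fact": out.get("fact"),
        "retrieve": bool(out.get("retrieve", False)),
        "tool": bool(out.get("tool", False)),
        "emotion": out.get("emotion", "neutral"),
    }

# Example reply string in the documented format:
raw = '{"fact":"user lives in Paris","retrieve":true,"tool":false,"emotion":"curious"}'
parsed = parse_router_output(raw)
```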

Tool:

Basic tool-intent detection; flags the request so the larger LLM receives tool-use context.

Retrieve:

Binary gate that engages the embedding model for memory retrieval.

Emotion:

Emotional tone passed to the larger LLM; persists across chats.
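The four fields above each gate one step of a memory pipeline. A hypothetical dispatch sketch (the function and the keyword-overlap lookup are illustrative stand-ins, not the card's implementation; a real setup would use the embedding model for retrieval):

```python
def route(decision: dict, memory_store: list, query: str) -> dict:
    """Turn one router decision into context for the larger LLM."""
    context = {"memories": [], "tool_hint": False, "emotion": decision["emotion"]}
    if decision["fact"]:
        # Durable personal fact -> store it.
        memory_store.append(decision["fact"])
    if decision["retrieve"]:
        # Binary gate -> memory lookup (keyword overlap stands in for
        # the embedding-model search the card describes).
        words = query.lower().split()
        context["memories"] = [m for m in memory_store if any(w in m for w in words)]
    if decision["tool"]:
        # Surface tool intent to the larger LLM.
        context["tool_hint"] = True
    return context

decision = {"fact": "user likes green tea", "retrieve": True, "tool": False, "emotion": "warm"}
memory = []
context = route(decision, memory, "what tea should I buy")
```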

Fine-tuned from SmolLM2-360M-Instruct for personal assistant memory classification.

Model size: 0.4B params (Safetensors, F16)
