How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
# Run inference directly in the terminal:
llama-cli -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
# Run inference directly in the terminal:
llama-cli -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
# Run inference directly in the terminal:
./llama-cli -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
Use Docker
docker model run hf.co/MigsN9/SmolLM2-360M-Instruct-Mem-Cat:Q8_0
Quick Links

Hermie Assistant - Memory Router

Sys prompt:

"\n{"fact":str|null,"retrieve":bool,"tool":bool,"emotion":str}\nfact: durable personal fact, compressed. null if nothing to store."

Output format:

{"fact":str|null,"retrieve":bool,"tool":bool,"emotion":str}

Tool:

Basic tool intent detection. For context for larger LLM.

Retrieve:

Binary gate that engages the embedding model for memory retrieval.

Emotion

Emotional response for larger LLM, persistent accross chats.

Fine-tuned from SmolLM2-360M-Instruct for personal assistant memory classification.

Downloads last month
29
Safetensors
Model size
0.4B params
Tensor type
F16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for MigsN9/SmolLM2-360M-Instruct-Mem-Cat

Quantized
(87)
this model