How to use from Pi
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf continuedev/instinct-GGUF:Q4_K_M
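Once the server has started, a quick request can confirm that it is reachable before Pi is pointed at it. The following is a minimal sketch, assuming the server listens on llama-server's default address of localhost:8080 (the same host and port as the baseUrl in the Pi configuration). The function name check_llama_server is hypothetical and exists only for this illustration.

```shell
# Hypothetical helper: checks that a llama.cpp server answers at the given base URL.
# Assumption: the server runs on the default address, http://localhost:8080.
check_llama_server() {
  base_url="${1:-http://localhost:8080}"

  # /health fails until the server is up and the model has finished loading.
  if ! curl -sf "$base_url/health" >/dev/null 2>&1; then
    echo "llama-server not ready at $base_url"
    return 1
  fi

  # List the model ids exposed on the OpenAI-compatible API.
  curl -sf "$base_url/v1/models"
}
```

Calling check_llama_server with no arguments checks the default address; pass a different base URL as the first argument if the server was started on another port.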
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "continuedev/instinct-GGUF:Q4_K_M"
        }
      ]
    }
  }
}
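A stray comma or missing brace in models.json is easy to introduce when editing by hand, so it can help to check that the file parses before starting Pi. This is a small sketch, assuming python3 is installed and using its standard json.tool module; the function name validate_models_json is hypothetical. It only checks that the file is well-formed JSON, not that Pi accepts its contents.

```shell
# Hypothetical helper: verifies that a Pi models.json file is well-formed JSON.
# Assumption: python3 is available on PATH.
validate_models_json() {
  config="${1:-$HOME/.pi/agent/models.json}"

  # json.tool exits non-zero if the file is missing or is not valid JSON.
  if python3 -m json.tool "$config" >/dev/null 2>&1; then
    echo "valid JSON: $config"
  else
    echo "invalid or missing JSON: $config"
    return 1
  fi
}
```

Running validate_models_json with no arguments checks the default path ~/.pi/agent/models.json.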
Run Pi
# Start Pi in your project directory:
pi

Q4_K_M Quantization of Instinct, Continue's Open Next-Edit Model

This is a Q4_K_M quantized GGUF version of the original model for efficient local inference.

Format: GGUF
Model size: 8B params
Architecture: qwen2
Quantization: 4-bit


Model tree for continuedev/instinct-GGUF

Base model: Qwen/Qwen2.5-7B
Quantized (9): this model