How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
Use Docker
docker model run hf.co/Andy-ML-And-AI/SocratesAI-GGUF:Q4_K_M
Quick Links

SocratesAI (GGUF Edition)

The Digital Gadfly | High-Precision Philosophical Reasoning | Mobile-Ready

"I cannot teach anybody anything. I can only make them think." โ€” SocratesAI

Project Overview

SocratesAI is a fine-tuned version of Mistral-7B-v0.3, engineered to move beyond standard LLM "assistant" behavior. Instead of providing direct answers, this model utilizes the Socratic Method to challenge assumptions, expose contradictions, and guide the user toward their own insights.

This repository contains the GGUF version, optimized for local execution on laptops, smartphones, and edge devices.

GGUF Specifications

  • Format: GGUF (Llama.cpp compatible)
  • Quantization: Q4_K_M (Optimal balance of intelligence and size)
  • Base Architecture: Mistral-7B-v0.3
  • Optimized via: Unsloth (2x faster inference, lower VRAM footprint)

Persona & Capabilities

Unlike standard chatbots, SocratesAI is:

  1. Inquisitive: It asks more questions than it answers.
  2. Ironical: It uses gentle irony to highlight logical fallacies.
  3. Persistent: It encourages deep critical thinking rather than shallow consensus.

Local Execution

To run Socrates locally using llama.cpp:

./main -m SocratesAI-Q4_K_M.gguf -n 512 --prompt "User: What is justice? \nSocratesAI:"
Downloads last month
25
GGUF
Model size
7B params
Architecture
llama
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for Andy-ML-And-AI/SocratesAI-GGUF

Quantized
(85)
this model