Instructions to use ghostai1/GHOSTAI-Spooky with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use ghostai1/GHOSTAI-Spooky with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ghostai1/GHOSTAI-Spooky",
    filename="ghostai-horror-7b.Q2_K.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use ghostai1/GHOSTAI-Spooky with llama.cpp:
Install from brew
```shell
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Install from WinGet (Windows)
```shell
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Use pre-built binary
```shell
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Build from source code
```shell
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Use Docker
```shell
docker model run hf.co/ghostai1/GHOSTAI-Spooky:Q4_K_M
```
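Once `llama-server` is running, it exposes an OpenAI-compatible API (on port 8080 by default); a minimal sketch of calling it with curl — the prompt text is illustrative:

```shell
# Assumes llama-server is already running on its default port (8080).
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "messages": [
      {"role": "user", "content": "Describe an abandoned lighthouse in two sentences."}
    ]
  }'
```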
- LM Studio
- Jan
- vLLM
How to use ghostai1/GHOSTAI-Spooky with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ghostai1/GHOSTAI-Spooky"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ghostai1/GHOSTAI-Spooky",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/ghostai1/GHOSTAI-Spooky:Q4_K_M
```
- Ollama
How to use ghostai1/GHOSTAI-Spooky with Ollama:
```shell
ollama run hf.co/ghostai1/GHOSTAI-Spooky:Q4_K_M
```
- Unsloth Studio
How to use ghostai1/GHOSTAI-Spooky with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ghostai1/GHOSTAI-Spooky to start chatting
```
Install Unsloth Studio (Windows)
```shell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ghostai1/GHOSTAI-Spooky to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ghostai1/GHOSTAI-Spooky to start chatting
```
- Pi
How to use ghostai1/GHOSTAI-Spooky with Pi:
Start the llama.cpp server
```shell
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Configure the model in Pi
```shell
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to `~/.pi/agent/models.json`:
```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {"id": "ghostai1/GHOSTAI-Spooky:Q4_K_M"}
      ]
    }
  }
}
```
Run Pi
```shell
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use ghostai1/GHOSTAI-Spooky with Hermes Agent:
Start the llama.cpp server
```shell
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Configure Hermes
```shell
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Run Hermes
```shell
hermes
```
- Docker Model Runner
How to use ghostai1/GHOSTAI-Spooky with Docker Model Runner:
```shell
docker model run hf.co/ghostai1/GHOSTAI-Spooky:Q4_K_M
```
- Lemonade
How to use ghostai1/GHOSTAI-Spooky with Lemonade:
Pull the model
```shell
# Download Lemonade from https://lemonade-server.ai/
lemonade pull ghostai1/GHOSTAI-Spooky:Q4_K_M
```
Run and chat with the model
```shell
lemonade run user.GHOSTAI-Spooky-Q4_K_M
```
List all available models
```shell
lemonade list
```
GHOSTAI – HORROR GGUF (7B)
A focused, horror-themed 7B model released exclusively in quantized GGUF format for the llama.cpp ecosystem.
Quantized-only release. No FP16 weights included.
Overview
GHOSTAI is a compact, atmosphere-driven horror model designed for narrative generation, roleplay, and dark storytelling.
It prioritizes tone, pacing, and vivid imagery over generic assistant behavior.
This repository provides multiple GGUF quantizations, allowing you to choose the best balance of quality, speed, and memory usage for your hardware.
The model runs:
- Fully on CPU
- With optional GPU offload (CUDA / Metal / Vulkan builds of llama.cpp)
Quantization choice is independent of whether you use CPU or GPU.
Files
| File | Quant | Approx size | Rough RAM needed (4k ctx) |
|---|---|---|---|
| ghostai-horror-7b.Q8_0.gguf | Q8_0 | ~7.2 GB | ~10–11 GB |
| ghostai-horror-7b.Q6_K.gguf | Q6_K | ~5.5 GB | ~8–9 GB |
| ghostai-horror-7b.Q5_K_M.gguf | Q5_K_M | ~4.8 GB | ~7–8 GB |
| ghostai-horror-7b.Q5_K_S.gguf | Q5_K_S | ~4.7 GB | ~7–8 GB |
| ghostai-horror-7b.Q4_K_M.gguf | Q4_K_M | ~4.1 GB | ~6–7 GB |
| ghostai-horror-7b.Q4_K_S.gguf | Q4_K_S | ~3.9 GB | ~6–7 GB |
| ghostai-horror-7b.Q3_K_M.gguf | Q3_K_M | ~3.3 GB | ~5–6 GB |
| ghostai-horror-7b.Q3_K_S.gguf | Q3_K_S | ~3.0 GB | ~5–6 GB |
| ghostai-horror-7b.Q2_K.gguf | Q2_K | ~2.5 GB | ~4–5 GB |
| ghostai-horror-7b.TQ1_0.gguf | TQ1_0 | ~1.6 GB | ~3–4 GB |
Notes:
- "Rough RAM needed" assumes ~4k context and typical llama.cpp overhead.
- For 8k context, plan on an extra 1–2 GB.
- GPU offload can shift some load to VRAM, but you still need system RAM.
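As a rule of thumb from the table above, required RAM is roughly the quantized file size plus context and runtime overhead. A minimal sketch of picking a quant for a given RAM budget — the overhead constants here are illustrative assumptions, not measured values:

```python
# Approximate GGUF file sizes from the table above, in GB,
# ordered from highest quality to smallest.
QUANT_SIZES_GB = {
    "Q8_0": 7.2, "Q6_K": 5.5, "Q5_K_M": 4.8, "Q5_K_S": 4.7,
    "Q4_K_M": 4.1, "Q4_K_S": 3.9, "Q3_K_M": 3.3, "Q3_K_S": 3.0,
    "Q2_K": 2.5, "TQ1_0": 1.6,
}

def estimated_ram_gb(quant: str, ctx: int = 4096) -> float:
    """Rough RAM estimate: file size + ~2 GB runtime overhead at 4k context,
    plus ~1.5 GB per extra 4k of context (illustrative constants)."""
    base = QUANT_SIZES_GB[quant] + 2.0
    extra_ctx_gb = 1.5 * max(0, (ctx - 4096) / 4096)
    return round(base + extra_ctx_gb, 1)

def largest_quant_for(ram_gb: float, ctx: int = 4096) -> str:
    """Pick the highest-quality quant whose estimate fits the RAM budget."""
    for quant in QUANT_SIZES_GB:  # dicts preserve insertion order
        if estimated_ram_gb(quant, ctx) <= ram_gb:
            return quant
    raise ValueError("No quant fits in the given RAM budget")

print(largest_quant_for(6.5))  # Q4_K_M fits in a ~6.5 GB budget
```

Treat the output as a starting point only; actual usage depends on your llama.cpp build, context length, and batch settings.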
Recommended Downloads
- Best default: Q4_K_M
- More quality (more RAM): Q5_K_M, Q6_K, Q8_0
- Low RAM: Q3_K_S, Q2_K
- Ultra-small / experimental: TQ1_0 (expect noticeable quality loss)
Quickstart (llama.cpp)
1) Run on CPU
```shell
./llama-cli \
  -m ghostai-horror-7b.Q4_K_M.gguf \
  -c 4096 \
  -t 8 \
  -p "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid."
```
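2) Run with GPU offload (optional). A hedged sketch using llama.cpp's `-ngl` (`--n-gpu-layers`) flag, assuming a CUDA, Metal, or Vulkan build; the layer count shown is illustrative and should be tuned to your VRAM:

```shell
# Offload up to 32 transformer layers to the GPU; the rest stay on CPU.
# Requires a GPU-enabled llama.cpp build (CUDA / Metal / Vulkan).
./llama-cli \
  -m ghostai-horror-7b.Q4_K_M.gguf \
  -c 4096 \
  -ngl 32 \
  -p "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid."
```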