ghostai1 committed · Commit 2e39b36 · verified · 1 Parent(s): 1acbf39

Update README.md

Files changed (1)
  1. README.md +37 -26
README.md CHANGED
@@ -2,63 +2,74 @@
  license: mit
  ---

- <!--
- GHOSTAI • HORROR GGUF RELEASE README
- Drop this into README.md at the root of your Hugging Face repo.
- -->

  <p align="center">
- <img src="https://capsule-render.vercel.app/api?type=waving&color=0:0b0b0f,50:2b0a2a,100:0b0b0f&height=160&section=header&text=GHOSTAI%20%E2%80%94%20HORROR%20GGUF&fontSize=44&fontColor=EAEAEA&animation=twinkling" />
  </p>

  <p align="center">
  <img alt="GGUF" src="https://img.shields.io/badge/GGUF-llama.cpp-8A2BE2?style=for-the-badge">
- <img alt="Base" src="https://img.shields.io/badge/Base-Mistral%207B%20Instruct%20v0.3-5B2C83?style=for-the-badge">
- <img alt="Quant" src="https://img.shields.io/badge/Quant-Q4__K__M%20%7C%20F16-3A0CA3?style=for-the-badge">
  <img alt="Theme" src="https://img.shields.io/badge/Theme-Horror-8B0000?style=for-the-badge">
  </p>

  <p align="center">
- <b>GHOSTAI</b> is a horror-flavored GGUF release (llama.cpp-ready) built from a LoRA fine-tune on <code>mistralai/Mistral-7B-Instruct-v0.3</code>.
- <br/>
- Pick your haunt: <b>F16</b> for max fidelity or <b>Q4_K_M</b> for the best everyday balance.
  </p>

  ---

  ## 🩸 What’s inside

- This repo contains **GGUF** files for fast local inference using **llama.cpp**-compatible runtimes.

- ### 🎃 Spooky file set

- | Codename | File | Format | Use case |
- |---|---|---:|---|
- | **GHOSTAI_FOGF16** | `model.f16.gguf` | f16 | Maximum quality (largest) |
- | **GHOSTAI_CRYPT_Q4KM** | `model.Q4_K_M.gguf` | Q4_K_M | Best default (quality/size) |
- | **GHOSTAI_WHISPER_IQ1S** | `model.IQ1_S.gguf` | IQ1_S | Tiny build (quality drop) |
- | **GHOSTAI_RAGDOLL_Q2K** | `model.Q2_K.gguf` | Q2_K | Fallback if IQ1_S unsupported |

- > Not all files may exist in every release—this table lists the intended set. Use the “Files” panel to confirm what’s included.

  ---

- ## 🧬 Base model

- - **Base**: `mistralai/Mistral-7B-Instruct-v0.3`
- - **Release type**: GGUF export (llama.cpp ecosystem)
- - **Training method**: LoRA fine-tune merged → GGUF → quantized

  ---

  ## ⚰️ Quickstart (llama.cpp)

- ### 1) Run on GPU (CUDA build)

  ```bash
  ./llama-cli \
- -m model.Q4_K_M.gguf \
  -ngl 99 \
  -c 4096 \
- -p "You are GHOSTAI. Speak like a calm narrator in a horror novel. Keep it concise."
  ```
 
  license: mit
  ---

+ <!-- GHOSTAI • HORROR GGUF (7B) — README -->

  <p align="center">
+ <img src="https://capsule-render.vercel.app/api?type=waving&color=0:0b0b0f,50:2b0a2a,100:0b0b0f&height=160&section=header&text=GHOSTAI%20%E2%80%94%20HORROR%20GGUF%20(7B)&fontSize=42&fontColor=EAEAEA&animation=twinkling" />
  </p>

  <p align="center">
  <img alt="GGUF" src="https://img.shields.io/badge/GGUF-llama.cpp-8A2BE2?style=for-the-badge">
+ <img alt="7B" src="https://img.shields.io/badge/Size-7B-5B2C83?style=for-the-badge">
  <img alt="Theme" src="https://img.shields.io/badge/Theme-Horror-8B0000?style=for-the-badge">
+ <img alt="Quant" src="https://img.shields.io/badge/Quants-Q8__0%20%7C%20Q6__K%20%7C%20Q5__K__M%20%7C%20Q4__K__M-3A0CA3?style=for-the-badge">
  </p>

  <p align="center">
+ <b>GHOSTAI</b> is a <b>horror-themed</b> <b>7B</b> GGUF release for the <b>llama.cpp</b> ecosystem.<br/>
+ This repo contains <b>quantized GGUFs only</b> (no FP16).
  </p>

  ---

  ## 🩸 What’s inside

+ Quantized GGUF files (7B) ready for llama.cpp-compatible runtimes.

+ ### 🎃 Files in this release

+ | File | Quant | Approx. size | Rough RAM needed (4k ctx) |
+ |---|---:|---:|---:|
+ | `ghostai-horror-7b.Q8_0.gguf` | Q8_0 | ~7.2 GB | ~10–11 GB |
+ | `ghostai-horror-7b.Q6_K.gguf` | Q6_K | ~5.5 GB | ~8–9 GB |
+ | `ghostai-horror-7b.Q5_K_M.gguf` | Q5_K_M | ~4.8 GB | ~7–8 GB |
+ | `ghostai-horror-7b.Q5_K_S.gguf` | Q5_K_S | ~4.7 GB | ~7–8 GB |
+ | `ghostai-horror-7b.Q4_K_M.gguf` | Q4_K_M | ~4.1 GB | ~6–7 GB |
+ | `ghostai-horror-7b.Q4_K_S.gguf` | Q4_K_S | ~3.9 GB | ~6–7 GB |
+ | `ghostai-horror-7b.Q3_K_M.gguf` | Q3_K_M | ~3.3 GB | ~5–6 GB |
+ | `ghostai-horror-7b.Q3_K_S.gguf` | Q3_K_S | ~3.0 GB | ~5–6 GB |
+ | `ghostai-horror-7b.Q2_K.gguf` | Q2_K | ~2.5 GB | ~4–5 GB |
+ | `ghostai-horror-7b.TQ1_0.gguf` | TQ1_0 | ~1.6 GB | ~3–4 GB |

+ **RAM notes (rough):**
+ - “Rough RAM needed” assumes **~4k context** and typical llama.cpp overhead.
+ - If you run **8k context**, add roughly **+1–2 GB**.
+ - GPU offload doesn’t remove the need for system RAM; it shifts some of the weight and KV-cache usage to VRAM, depending on your settings.
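[Editor's note] The RAM notes above can be sanity-checked: resident memory is roughly the GGUF file size plus the KV cache plus runtime overhead. A minimal sketch, assuming Mistral-7B-class dimensions (32 layers, 8 KV heads, head dim 128, f16 KV cache) and ~1.5 GB overhead; both are assumptions, since the README does not state the base architecture:

```python
def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elt=2):
    """Size of the K and V caches in bytes (f16 elements by default).

    ASSUMPTION: Mistral-7B-like dimensions with grouped-query attention;
    check your GGUF's metadata for the real values. Models without GQA
    (32 KV heads instead of 8) need roughly 4x more KV cache.
    """
    # 2 = one K cache plus one V cache
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elt


def est_total_ram_gb(file_size_gb, n_ctx=4096, overhead_gb=1.5):
    """Rough total: weights (mmap'd file) + KV cache + assumed runtime overhead."""
    return file_size_gb + kv_cache_bytes(n_ctx) / 2**30 + overhead_gb


# Q4_K_M (~4.1 GB file) at 4k context lands inside the table's ~6-7 GB band.
print(round(est_total_ram_gb(4.1), 1))  # → 6.1
```

Doubling the context doubles the KV cache, which is the direction of the "+1–2 GB at 8k" note above.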
 
  ---

+ ## 🧟 Which quant should I use?

+ - **Best default:** `Q4_K_M`
+ - **Higher quality:** `Q5_K_M` or `Q6_K`
+ - **If you have plenty of RAM:** `Q8_0`
+ - **Low RAM:** `Q3_K_S` / `Q2_K`
+ - **Tiny / experimental:** `TQ1_0` (expect quality loss)
+
+ These formats are **not “CPU vs GPU.”**
+ You can run any quant on CPU-only or with GPU offload.
62
  ---
63
 
64
  ## ⚰️ Quickstart (llama.cpp)
65
 
66
+ ### GPU offload (CUDA build)
67
 
68
  ```bash
69
  ./llama-cli \
70
+ -m ghostai-horror-7b.Q4_K_M.gguf \
71
  -ngl 99 \
72
  -c 4096 \
73
+ -p "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid."
74
+
75
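[Editor's note] `llama-cli -p` passes the prompt through verbatim, so instruct-tuned weights usually behave best when the prompt follows the base model's chat template. A minimal sketch of a Mistral-Instruct-style template; this is an assumption based on the base model named in the earlier README (`mistralai/Mistral-7B-Instruct-v0.3`), so verify against the chat template embedded in the GGUF metadata:

```python
def build_prompt(system: str, user: str) -> str:
    # ASSUMPTION: the model follows Mistral-Instruct [INST] ... [/INST]
    # conventions. llama.cpp's conversation mode can instead apply the
    # chat template stored in the GGUF itself, which is the safer default.
    return f"<s>[INST] {system}\n\n{user} [/INST]"


prompt = build_prompt(
    "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid.",
    "Describe the lighthouse at the end of the road.",
)
print(prompt)
```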