---
title: AIpster
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
---

# AIpster

**An independent think tank on artificial intelligence, society, and the future of thought.**

We're a collective of computer science friends from the late '90s who turned a WhatsApp group into a laboratory for exploring what AI is doing to how we work, build, and think.

🌐 [aipster.com](https://aipster.com)

---

## What we do here

This Hugging Face organization is where we publish the **artifacts** of our exploration — models, datasets, and tools that come out of the experiments we write about on our blog.

We're not a company. We don't sell anything. We build things to understand them, then share what we learned.

---

## Focus areas

- 🔬 **Small specialist models** — distillation, fine-tuning, and the art of making tiny models punch above their weight
- 🧭 **Prompt engineering & routing** — how prompts become infrastructure, not just text
- 🛠️  **Local LLM workflows** — what 96 GB of VRAM can (and can't) do
- 🤖 **Coding agents & automation** — how AI is reshaping software development from the inside out
- 📖 **AI & society** — the uncomfortable conversations the industry would rather skip

---

## What you'll find here

### Models

**[DevRouter-1.5B](https://huggingface.co/aipster/DevRouter-1.5B)** is our first release: a tiny prompt router that reads a raw developer prompt and returns a single JSON decision containing a cleaned-up rewrite, an `intent` / `complexity` classification, a suggested model-tier `route`, and the context the prompt forgot to include. Built on Qwen2.5-Coder-1.5B (Apache 2.0) and distilled from a stronger teacher, it holds a **~96% valid-JSON rate** and runs at **~280 tokens/s on a single RTX 3090**, which makes it small enough to sit in front of your real models and triage every prompt in 1–3 seconds.

- 🧠 [aipster/DevRouter-1.5B](https://huggingface.co/aipster/DevRouter-1.5B) — fp16 weights (transformers / vLLM)
- 📦 [aipster/DevRouter-1.5B-GGUF](https://huggingface.co/aipster/DevRouter-1.5B-GGUF) — Q8_0 + F16, plug-n-play with Ollama / llama.cpp
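A minimal sketch of how a caller might consume the router's JSON decision before dispatching to a larger model. Only `intent`, `complexity`, and `route` are named above; the keys `rewrite` and `missing_context`, the tier values `small` / `large`, and the helper names are hypothetical, so check the model card for the actual schema.

```python
import json
from typing import Optional

# "intent", "complexity" and "route" come from the description above.
# "rewrite" and "missing_context" are hypothetical key names (assumption).
REQUIRED_KEYS = ("rewrite", "intent", "complexity", "route", "missing_context")


def parse_decision(raw: str) -> Optional[dict]:
    """Return the router decision as a dict, or None if the output is unusable."""
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(decision, dict):
        return None
    if any(key not in decision for key in REQUIRED_KEYS):
        return None
    return decision


def pick_tier(raw: str, fallback: str = "large") -> str:
    """Use the suggested route when the decision parses, else a safe default."""
    decision = parse_decision(raw)
    if decision is None:
        return fallback
    return str(decision["route"])


# Illustrative router output; the values are made up for this example.
sample = (
    '{"rewrite": "Fix the off-by-one error in paginate()", '
    '"intent": "bugfix", "complexity": "low", "route": "small", '
    '"missing_context": ["language", "failing test"]}'
)
print(pick_tier(sample))      # small
print(pick_tier("not json"))  # large
```

Falling back to the larger tier when parsing fails is one possible design choice: a malformed decision then costs extra compute rather than a wrong answer.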

And one honest caveat, because we ship those too: **quantizations at Q6 and below break its JSON.** A small model doing strict structured output is far more fragile than the "Q4 is fine" rule of thumb suggests, so ship Q8_0 or F16.
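If you want to check this caveat against your own prompts, one rough approach is to collect the raw outputs from each quantization and count how many parse as a JSON object. The sketch below is an assumption about how such a check could look, not the evaluation behind the figures above; `valid_json_rate` is a hypothetical helper and the sample strings are invented, not measurements.

```python
import json
from typing import List


def valid_json_rate(outputs: List[str]) -> float:
    """Fraction of outputs that parse as a JSON object (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            # Count only JSON objects; a bare list or string is not a decision.
            if isinstance(json.loads(text), dict):
                valid += 1
        except json.JSONDecodeError:
            pass
    return valid / len(outputs)


# Invented outputs showing typical failure shapes: truncation and chatty preamble.
samples = [
    '{"route": "small"}',
    '{"route": "large"',
    "Sure! Here is the JSON:",
    '{"route": "small"}',
]
print(valid_json_rate(samples))  # 0.5
```

Running the same prompt set through each GGUF file and comparing the rates gives a like-for-like view of how much a given quantization costs in structured-output reliability.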

### Datasets

*Coming soon*: curated and synthetic datasets from our distillation experiments, released alongside the models that use them.

### Spaces

*Coming soon*: interactive demos of our experiments.

---

## Read our work

- 📝 [Blog](https://aipster.com)
- 🧪 [How we built our distillation pipeline](https://aipster.com) *(coming soon)*
- 🔍 [Four GPUs, Two Weeks, and the Uncomfortable Truth About Local LLMs](https://aipster.com/four-gpus-two-weeks-and-the-uncomfortable-truth-about-local-llms/)
- 🤖 [I Stopped Learning n8n. I Just Told My Coding Agent What I Wanted](https://aipster.com/i-stopped-learning-n8n-i-just-told-my-coding-agent-what-i-wanted/)
- 💸 [Two Hours to Mass Extinction: What Coding Agents Mean for the Open-Core Business Model](https://aipster.com/two-hours-to-mass-extinction-what-coding-agents-mean-for-the-open-core-business-model/)

---

## Philosophy

> We build to understand. We share to learn together.

Everything we publish here is open. Code, weights, datasets, methodology — including the failures. Especially the failures.

---

## Get in touch

- 🌐 Website: [aipster.com](https://aipster.com)
- 📬 Email: contact@aipster.com

---

*Independent. Curious. Slightly skeptical.*