---
license: apache-2.0
tags:
- abliterated
- gguf
- ollama
- crewai
- multi-agent
- qwen2.5-coder
base_model:
- Qwen/Qwen2.5-Coder-14B-Instruct
- Qwen/Qwen2.5-Coder-3B-Instruct
---

# Bruno Swarm Models

7 abliterated Qwen2.5-Coder models for multi-agent software development using [CrewAI](https://github.com/crewai/crewai) + [Ollama](https://ollama.com).

Created with [Bruno](https://github.com/rawcell/heretic), a tool for neural behavior modification via contrastive activation analysis and orthogonalization.

## Models

| Model | Base | Size | Role |
|-------|------|------|------|
| `orchestrator-14b-f16.gguf` | Qwen2.5-Coder-14B-Instruct | 28 GB | Senior Architect / Project Manager |
| `frontend-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | React / TypeScript / Tailwind |
| `backend-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | FastAPI / PostgreSQL / async |
| `test-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | pytest / coverage / edge cases |
| `security-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | OWASP / vulnerability assessment |
| `docs-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | API docs / README / guides |
| `devops-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | Docker / CI-CD / IaC |

Total: ~63 GB (all F16-precision GGUF).

## Abliteration Details

Each model was independently abliterated with Bruno to reduce refusal behavior while preserving coding capability. The six specialists share the same base model (Qwen2.5-Coder-3B-Instruct) but carry different abliteration weights from separate optimization runs.

**Orchestrator (14B)**:
- KL divergence from base: 0.47
- Refusals: 63/67 test prompts answered (~6% residual refusal rate)
- Optuna trials: 50

**Specialists (3B)**:
- Each independently optimized for its own domain
- All retain full coding capability

## Quick Start

### 1. Download models and Modelfiles

```bash
# Install git-lfs
git lfs install

# Clone (63 GB download)
git clone https://huggingface.co/rawcell/bruno-swarm-models
cd bruno-swarm-models
```

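A fresh clone can silently contain git-lfs pointer stubs instead of the real weights. A quick sanity check, using only the filenames from the table above (the helper name `missing_models` is ours, not part of any shipped tooling):

```python
# Verify the seven GGUF files actually downloaded (run inside the cloned repo).
from pathlib import Path

EXPECTED = [
    "orchestrator-14b-f16.gguf",
    "frontend-3b-f16.gguf",
    "backend-3b-f16.gguf",
    "test-3b-f16.gguf",
    "security-3b-f16.gguf",
    "docs-3b-f16.gguf",
    "devops-3b-f16.gguf",
]

def missing_models(repo_dir: str = ".") -> list[str]:
    """Return the expected GGUF files that are not present on disk."""
    return [name for name in EXPECTED if not Path(repo_dir, name).is_file()]

print(missing_models())  # lists any files still missing
```

An empty list means every model is in place; anything else usually means `git lfs pull` still needs to run.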
### 2. Import into Ollama

Update the `FROM` paths in each Modelfile to point to your local GGUF files, then:

```bash
# Import each model
ollama create orchestrator -f modelfiles/Modelfile.orchestrator
ollama create frontend -f modelfiles/Modelfile.frontend
ollama create backend -f modelfiles/Modelfile.backend
ollama create test -f modelfiles/Modelfile.test
ollama create security -f modelfiles/Modelfile.security
ollama create docs -f modelfiles/Modelfile.docs
ollama create devops -f modelfiles/Modelfile.devops
```

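Once imported, any of the models can also be queried directly over Ollama's local REST API, independent of CrewAI. A minimal sketch, assuming Ollama is running on its default port 11434 (the helper names `build_payload` and `ask` are ours):

```python
# Query one imported model via Ollama's /api/generate endpoint (stdlib only).
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    """Request body for a single non-streaming generation."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """Send one prompt to a locally imported model and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server with the model imported):
# print(ask("backend", "Write a FastAPI health-check endpoint."))
```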
### 3. Run with the bruno-swarm CLI

```bash
pip install "bruno-ai[swarm]"
bruno-swarm run --task "Build a REST API with authentication"
```

Or use flat mode to select specific specialists:

```bash
bruno-swarm run --task "Write unit tests for auth module" --flat --agents test,security
```

## Ollama Configuration

For multi-model operation, set these environment variables before starting the Ollama server:

```bash
export OLLAMA_MAX_LOADED_MODELS=3
export OLLAMA_KEEP_ALIVE=30m
```

`OLLAMA_MAX_LOADED_MODELS` caps how many models stay resident concurrently; `OLLAMA_KEEP_ALIVE` keeps a model in memory between requests instead of unloading it immediately.

## Hardware Requirements

- **Full swarm (hierarchical)**: 40+ GB VRAM (28 GB orchestrator + one specialist at a time)
- **Specialists only (flat)**: 8+ GB VRAM (one 3B model at a time)
- **All models loaded**: ~63 GB VRAM (A100 80GB or similar)

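These figures follow from the sizes in the Models table; our reading is that the gap between 33.8 GB of weights and the quoted 40+ GB covers runtime overhead (KV cache, framework buffers). A quick back-of-the-envelope check:

```python
# Sanity-check the headline VRAM numbers against the per-model sizes.
ORCHESTRATOR_GB = 28.0   # orchestrator-14b-f16.gguf
SPECIALIST_GB = 5.8      # each 3B specialist
N_SPECIALISTS = 6

total_gb = ORCHESTRATOR_GB + N_SPECIALISTS * SPECIALIST_GB   # ~62.8, i.e. the quoted ~63 GB
hierarchical_peak_gb = ORCHESTRATOR_GB + SPECIALIST_GB       # 33.8 GB of weights resident at once

print(total_gb, hierarchical_peak_gb)
```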
## Modelfiles

The `modelfiles/` directory contains an Ollama Modelfile for each model with tuned parameters:

- `num_ctx 8192` (required to fit CrewAI system prompts)
- `num_predict 2048` for specialists, `4096` for the orchestrator
- `temperature 0.7`, `top_p 0.9`, `top_k 40`

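For reference, a Modelfile with these parameters looks roughly like the sketch below; the local GGUF path and the `SYSTEM` text are illustrative, not the exact shipped values:

```
FROM ./backend-3b-f16.gguf

PARAMETER num_ctx 8192
PARAMETER num_predict 2048
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """You are a backend specialist: FastAPI, PostgreSQL, async Python."""
```

Check the shipped files in `modelfiles/` for the actual system prompts before importing.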
## License

Apache 2.0, the same license as the base Qwen2.5-Coder models.