rawcell committed (verified) · Commit 86f9939 · Parent: be744dc

Upload README.md with huggingface_hub

Files changed (1): README.md added (+113 lines)

---
license: apache-2.0
tags:
- abliterated
- gguf
- ollama
- crewai
- multi-agent
- qwen2.5-coder
base_model:
- Qwen/Qwen2.5-Coder-14B-Instruct
- Qwen/Qwen2.5-Coder-3B-Instruct
---

# Bruno Swarm Models

Seven abliterated Qwen2.5-Coder models for multi-agent software development with [CrewAI](https://github.com/crewai/crewai) and [Ollama](https://ollama.com).

Created with [Bruno](https://github.com/rawcell/heretic), which performs neural behavior modification via contrastive activation analysis and orthogonalization.

## Models

| Model | Base | Size | Role |
|-------|------|------|------|
| `orchestrator-14b-f16.gguf` | Qwen2.5-Coder-14B-Instruct | 28 GB | Senior Architect / Project Manager |
| `frontend-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | React / TypeScript / Tailwind |
| `backend-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | FastAPI / PostgreSQL / async |
| `test-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | pytest / coverage / edge cases |
| `security-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | OWASP / vulnerability assessment |
| `docs-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | API docs / README / guides |
| `devops-3b-f16.gguf` | Qwen2.5-Coder-3B-Instruct | 5.8 GB | Docker / CI-CD / IaC |

Total: ~63 GB (all F16-precision GGUF)

## Abliteration Details

Each model was independently abliterated with Bruno to reduce refusal behavior while preserving coding capability. The six specialists share the same base model (Qwen2.5-Coder-3B-Instruct) but carry different abliteration weights from separate optimization runs.

**Orchestrator (14B)**:
- KL divergence from base: 0.47
- Refusals: 63 of 67 test prompts answered (≈6% still refused)
- Optuna trials: 50

**Specialists (3B)**:
- Each independently optimized for its own domain
- All retain full coding capability

## Quick Start

### 1. Download models and Modelfiles

```bash
# Install Git LFS (needed for the large GGUF files)
git lfs install

# Clone the repository (~63 GB download)
git clone https://huggingface.co/rawcell/bruno-swarm-models
cd bruno-swarm-models
```

### 2. Import into Ollama

Update the `FROM` paths in each Modelfile to point to your local GGUF files, then:

```bash
# Import each model
ollama create orchestrator -f modelfiles/Modelfile.orchestrator
ollama create frontend -f modelfiles/Modelfile.frontend
ollama create backend -f modelfiles/Modelfile.backend
ollama create test -f modelfiles/Modelfile.test
ollama create security -f modelfiles/Modelfile.security
ollama create docs -f modelfiles/Modelfile.docs
ollama create devops -f modelfiles/Modelfile.devops
```

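The seven `ollama create` commands above can be collapsed into a loop. A minimal sketch, assuming the Modelfiles keep the `Modelfile.<role>` naming used in this repository:

```shell
# Import every Modelfile in one pass, deriving the Ollama model name
# from the filename suffix (e.g. Modelfile.frontend -> "frontend").
for f in modelfiles/Modelfile.*; do
  [ -e "$f" ] || continue   # glob matched nothing; skip the literal pattern
  ollama create "${f##*.}" -f "$f"
done
```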
### 3. Run with the bruno-swarm CLI

```bash
pip install "bruno-ai[swarm]"
bruno-swarm run --task "Build a REST API with authentication"
```

Or use flat mode to select specific specialists:

```bash
bruno-swarm run --task "Write unit tests for auth module" --flat --agents test,security
```

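Each agent is an ordinary Ollama model, so a specialist can also be queried directly over Ollama's HTTP API, outside the bruno-swarm CLI. An illustrative sketch, assuming a local Ollama server on the default port and the `security` model imported above (the prompt is a made-up example):

```shell
# Non-streaming generate request; the reply arrives as a single JSON object
# with the completion in its "response" field.
payload='{"model": "security", "prompt": "Review this login handler for OWASP Top 10 issues.", "stream": false}'
curl -s http://localhost:11434/api/generate -d "$payload"
```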
## Ollama Configuration

For multi-model operation, set these environment variables before starting Ollama:

```bash
export OLLAMA_MAX_LOADED_MODELS=3
export OLLAMA_KEEP_ALIVE=30m
```

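Note that `export` only reaches an Ollama server started from that same shell. On Linux installs where Ollama runs as a systemd service, one common approach (unit name assumed to be `ollama`) is to set the variables on the unit via `sudo systemctl edit ollama`, adding an override like:

```
[Service]
Environment="OLLAMA_MAX_LOADED_MODELS=3"
Environment="OLLAMA_KEEP_ALIVE=30m"
```

then restarting the service with `sudo systemctl restart ollama`.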
## Hardware Requirements

- **Full swarm (hierarchical)**: 40+ GB VRAM (28 GB orchestrator plus one 3B specialist at a time)
- **Specialists only (flat)**: 8+ GB VRAM (one 3B model at a time)
- **All models loaded**: 63 GB VRAM (A100 80GB or similar)

## Modelfiles

The `modelfiles/` directory contains an Ollama Modelfile for each model with tuned parameters:
- `num_ctx 8192` (required for CrewAI system prompts)
- `num_predict 2048` for specialists, `4096` for the orchestrator
- `temperature 0.7`, `top_p 0.9`, `top_k 40`

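For reference, a Modelfile assembled from the parameters above might look like this hypothetical `Modelfile.frontend` (the `FROM` path and `SYSTEM` prompt are placeholders to adapt, not copied from the repository):

```
FROM ./frontend-3b-f16.gguf
PARAMETER num_ctx 8192
PARAMETER num_predict 2048
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 40
SYSTEM "You are a frontend specialist: React, TypeScript, Tailwind."
```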
## License

Apache 2.0 (same as the base Qwen2.5-Coder models)