# Stack 2.9: 5-Minute Quick Start

> **Goal:** Get Stack 2.9 running and solving coding tasks in under 5 minutes.

Stack 2.9 is an AI coding assistant powered by **Qwen2.5-Coder-32B** with Pattern Memory: it learns from your interactions and improves over time.

---

## 📋 Prerequisites

### Required
| Requirement | Version | Check |
|-------------|---------|-------|
| Python | 3.10+ | `python3 --version` |
| Git | Any recent | `git --version` |
| pip | Latest | `pip --version` |
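
The three checks in the table can also be run in one go. The following is a minimal sketch, not part of Stack 2.9: the function name `check_prerequisites` is hypothetical, and it only tests that the tools are on `PATH` and that Python meets the 3.10 minimum.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the table above


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict mapping each requirement to True (found) or False."""
    return {
        "python": tuple(version_info[:2]) >= MIN_PYTHON,
        "git": which("git") is not None,
        # pip may be installed as `pip` or `pip3`
        "pip": which("pip") is not None or which("pip3") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

The `version_info` and `which` parameters exist only so the check can be exercised without touching the real system.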

### Optional (Recommended)
| Resource | Why You Need It | Minimum |
|----------|----------------|---------|
| **GPU** | Fast code generation | RTX 3070 / M1 Pro |
| **24GB VRAM** | Run the 32B model with 4-bit quantization | 8GB for 7B quantized |

> **No GPU?** Stack 2.9 works on CPU via Ollama or cloud providers (OpenAI, Together AI, etc.).

---

## ⚡ Step 1: Install in 60 Seconds

```bash
# 1. Clone the repository
git clone https://github.com/my-ai-stack/stack-2.9.git
cd stack-2.9

# 2. Create a virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate    # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# 4. Copy environment template
cp .env.example .env
```

**That's it.** If you hit errors, see [Troubleshooting](#-troubleshooting) below.

---

## 🔑 Step 2: Configure Your Model Provider

Stack 2.9 supports multiple LLM providers. **Pick one that matches your setup:**

### Option A: Ollama (Recommended: Local, Private)

```bash
# Install Ollama (Linux; on macOS, download the app from https://ollama.com/download)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the Qwen model
ollama pull qwen2.5-coder:32b

# Set environment
export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:32b
```

Edit your `.env` file:
```env
MODEL_PROVIDER=ollama
OLLAMA_MODEL=qwen2.5-coder:32b
```

### Option B: Together AI (Best for Qwen, Cloud)

```bash
# Get your API key at https://together.ai
export TOGETHER_API_KEY=tog-your-key-here
```

Edit your `.env`:
```env
MODEL_PROVIDER=together
TOGETHER_API_KEY=tog-your-key-here
TOGETHER_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
```

### Option C: OpenAI (GPT-4o)

```env
MODEL_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
OPENAI_MODEL=gpt-4o
```

### Option D: Anthropic (Claude)

```env
MODEL_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
```

### Option E: OpenRouter (Unified Access)

```env
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-your-key-here
OPENROUTER_MODEL=openai/gpt-4o
```
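
Before launching, it can help to confirm that `.env` contains what the chosen provider needs. The sketch below is an illustration only: the variable names come from the options above, but `parse_env`, `missing_keys`, and the assumption that every listed variable is mandatory are hypothetical and not confirmed by Stack 2.9 itself.

```python
# Variable names are taken from Options A-E; treating all of them as
# required is an assumption made for this sketch.
REQUIRED_KEYS = {
    "ollama": ["OLLAMA_MODEL"],
    "together": ["TOGETHER_API_KEY", "TOGETHER_MODEL"],
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL"],
    "anthropic": ["ANTHROPIC_API_KEY", "ANTHROPIC_MODEL"],
    "openrouter": ["OPENROUTER_API_KEY", "OPENROUTER_MODEL"],
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config


def missing_keys(config):
    """Return the variables the chosen provider needs but the config lacks."""
    provider = config.get("MODEL_PROVIDER", "")
    if provider not in REQUIRED_KEYS:
        return ["MODEL_PROVIDER"]
    return [key for key in REQUIRED_KEYS[provider] if not config.get(key)]


sample = "MODEL_PROVIDER=together\nTOGETHER_API_KEY=tog-your-key-here\n"
print(missing_keys(parse_env(sample)))  # ['TOGETHER_MODEL']
```

To check a real file, pass its contents instead of `sample`, for example `parse_env(open(".env").read())`.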

---

## 🚀 Step 3: Run Your First Task

### Interactive Chat Mode

```bash
python stack.py
```

You'll see:
```
╔══════════════════════════════════════════════╗
║         Stack 2.9 - AI Coding Assistant      ║
║  Pattern Memory: Active | Tools: 46          ║
╚══════════════════════════════════════════════╝

You: Write a Python function to reverse a string
```

### Single Query Mode

```bash
python stack.py -c "Write a Python function to reverse a string"
```

**Expected output:**
```python
def reverse_string(s):
    """Reverse a string and return it."""
    return s[::-1]

# Or an equivalent version using reversed():
def reverse_string(s):
    return ''.join(reversed(s))
```

### Ask About Your Codebase

```bash
python stack.py -c "Find all Python files modified in the last week and list them"
```

### Generate and Run Code

```bash
python stack.py -c "Create a hello world Flask app with one route"
```

---

## 📊 Step 4: Run Evaluation (Optional)

> **Note:** Evaluation requires a GPU with ~16GB VRAM or more.

### Prepare Your Fine-Tuned Model

After training Stack 2.9 on your data, your merged model will be in:
```
./output/merged/
```

### Run HumanEval Benchmark

```bash
python evaluate_model.py \
    --model-path ./output/merged \
    --benchmark humaneval \
    --num-samples 10 \
    --output results.json
```

### Run MBPP Benchmark

```bash
python evaluate_model.py \
    --model-path ./output/merged \
    --benchmark mbpp \
    --num-samples 10 \
    --output results.json
```

### Run Both Benchmarks

```bash
python evaluate_model.py \
    --model-path ./output/merged \
    --benchmark both \
    --num-samples 10 \
    --k-values 1,10 \
    --output results.json
```

**Expected output format** (the numbers are illustrative):
```
============================================================
HumanEval Results
============================================================
  pass@1: 65.00%
  pass@10: 82.00%
  Total problems evaluated: 12
============================================================

============================================================
MBPP Results
============================================================
  pass@1: 70.00%
  pass@10: 85.00%
  Total problems evaluated: 12
============================================================
```
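
For readers unfamiliar with the pass@k numbers above, here is a minimal sketch of the widely used unbiased estimator for a single problem. It assumes `evaluate_model.py` follows the same definition, which this guide does not confirm; the function name `pass_at_k` is chosen for illustration.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: n samples, c of them correct."""
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# One problem, 10 samples, 3 of them correct:
print(round(pass_at_k(10, 3, 1), 4))   # 0.3
print(round(pass_at_k(10, 3, 10), 4))  # 1.0
```

A benchmark score is then the mean of this value over all evaluated problems, which is why `--num-samples` must be at least as large as the largest value passed to `--k-values`.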

### Quick Evaluation (5 Problems Only)

```bash
python evaluate_model.py \
    --model-path ./output/merged \
    --benchmark humaneval \
    --num-problems 5 \
    --num-samples 5
```

---

## 🐳 Step 5: Deploy Stack 2.9

### Deploy Locally with Docker

```bash
# Build the image, then start the container
docker build -t stack-2.9 .
docker run -p 7860:7860 \
    -e MODEL_PROVIDER=ollama \
    -e OLLAMA_MODEL=qwen2.5-coder:32b \
    stack-2.9
```

Access at: **http://localhost:7860**
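
The container may take a while to start listening. A small sketch for checking whether the published port accepts connections follows; the helper name `port_open` is hypothetical, and a successful TCP connection only shows that something is listening on the port, not that the model has finished loading.

```python
import socket


def port_open(host="localhost", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("up" if port_open() else "not reachable yet")
```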

### Deploy to RunPod (Cloud GPU)

```bash
# Edit runpod_deploy.sh with your config first
bash runpod_deploy.sh --gpu a100 --instance hourly
```

### Deploy to Kubernetes

```bash
# 1. Edit k8s/secret.yaml with your HuggingFace token
# 2. Apply the manifests
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/pvc.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

# Check status
kubectl get pods -n stack-29
kubectl logs -n stack-29 deployment/stack-29
```

### Hardware Requirements for Deployment

| Model Size | Minimum GPU | Recommended | Quantized (4-bit) |
|------------|-------------|-------------|-------------------|
| 7B | RTX 3070 (8GB) | A100 40GB | RTX 3060 (6GB) |
| 32B | A100 40GB | A100 80GB | RTX 3090 (24GB) |
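
As a rough cross-check of the table, the memory taken by the weights alone is the parameter count times the bytes per parameter. The sketch below is a lower bound under that assumption; the KV cache, activations, and framework overhead come on top, so real requirements are higher. The function name is hypothetical.

```python
def weight_memory_gb(params_billions, bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_param / 8


for size in (7, 32):
    for bits in (8, 4):
        print(f"{size}B @ {bits}-bit: {weight_memory_gb(size, bits):.1f} GB")
```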

---

## 🧠 Pattern Memory Quick Guide

Stack 2.9 stores successful patterns to help with future tasks.

### List Your Patterns

```bash
python stack.py --patterns list
python stack.py --patterns stats
```

### Extract Patterns from Your Git History

```bash
python scripts/extract_patterns_from_git.py \
    --repo-path . \
    --output patterns.jsonl \
    --since-date "2024-01-01"
```
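
The `.jsonl` extension suggests JSON Lines output, one JSON object per line. Assuming that is the case, the file can be inspected with a few lines of Python. The record fields in the sample are hypothetical, since this guide does not document the schema.

```python
import json


def load_jsonl(text):
    """Parse JSON Lines text into a list of objects, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical records; the real fields depend on the extraction script.
sample = '{"id": 1, "kind": "example"}\n\n{"id": 2, "kind": "example"}\n'
records = load_jsonl(sample)
print(len(records))  # 2
```

For the real file, read it first and pass the text in, for example `load_jsonl(open("patterns.jsonl").read())`.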

### Merge LoRA Adapters (Team Sharing)

```bash
python scripts/merge_lora_adapters.py \
    --adapters adapter_a.safetensors adapter_b.safetensors \
    --weights 0.7 0.3 \
    --output merged.safetensors
```
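
To make the `--weights` flag concrete, here is a toy sketch of a weighted linear combination of adapter tensors, matched by name. It is an assumption about what such a merge might compute, not the script's actual method: this guide does not say whether the script combines the LoRA factor matrices directly or the resulting weight deltas. Tensors are plain Python lists and the tensor name is hypothetical.

```python
def merge_adapters(adapters, weights):
    """Weighted sum of same-shaped tensors, stored here as flat lists."""
    if len(adapters) != len(weights):
        raise ValueError("need one weight per adapter")
    keys = set(adapters[0])
    if any(set(a) != keys for a in adapters):
        raise ValueError("adapters must share the same tensor names")
    merged = {}
    for key in sorted(keys):
        # Walk the tensors element by element across all adapters.
        columns = zip(*(a[key] for a in adapters))
        merged[key] = [sum(w * v for w, v in zip(weights, col)) for col in columns]
    return merged


# Toy tensors with a hypothetical name; real adapters hold many matrices.
a = {"layer0.lora_A": [1.0, 2.0]}
b = {"layer0.lora_A": [3.0, 4.0]}
print(merge_adapters([a, b], [0.5, 0.5]))  # {'layer0.lora_A': [2.0, 3.0]}
```

With weights such as `0.7 0.3`, the first adapter simply contributes more to each merged value.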

---

## 🛠️ Troubleshooting

### "Module not found" errors

```bash
pip install -r requirements.txt
```

### "CUDA out of memory" during evaluation

```bash
# Generate fewer samples per problem
python evaluate_model.py --model-path ./output/merged --num-samples 5

# Or use 4-bit quantization
# (See docs/TRAINING_7B.md for quantized training)
```

### "Model not found" with Ollama

```bash
ollama pull qwen2.5-coder:32b
ollama list   # Verify it's installed
```

### "API key not set" errors

```bash
# Double-check your .env file
cat .env

# For testing, you can also set inline
export TOGETHER_API_KEY=tog-your-key
```

### Slow inference on CPU

```bash
# Use a smaller model
export OLLAMA_MODEL=qwen2.5-coder:7b

# Or switch to cloud
export MODEL_PROVIDER=together
```

### Docker build fails

```bash
# Use Python 3.10 explicitly
docker build --build-arg PYTHON_VERSION=3.10 -t stack-2.9 .
```

### Kubernetes GPU not found

```bash
# Verify nvidia.com/gpu label on your node
kubectl get nodes -L nvidia.com/gpu

# Install NVIDIA GPU Operator if missing
# https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/
```

---

## 📚 What's Next?

| Goal | Go To |
|------|-------|
| Train on my own data | `docs/TRAINING_7B.md` |
| Learn all 46 tools | `TOOLS.md` |
| Set up team pattern sharing | `docs/pattern-moat.md` |
| Understand the architecture | `docs/reference/ARCHITECTURE.md` |
| Report a bug | `SECURITY.md` / GitHub Issues |

---

## ⚡ Quick Reference Card

```bash
# Install
git clone https://github.com/my-ai-stack/stack-2.9.git
cd stack-2.9 && pip install -r requirements.txt

# Configure
cp .env.example .env   # Edit with your API keys

# Run
python stack.py                              # Interactive
python stack.py -c "your code request"        # Single query

# Evaluate
python evaluate_model.py --model-path ./output/merged --benchmark humaneval

# Deploy
docker build -t stack-2.9 . && docker run -p 7860:7860 stack-2.9
```

---

*Stack 2.9: AI that learns your patterns and grows with you.*