# BuildwellAI Model V2

Fine-tuned Qwen3-14B for UK construction industry applications.

## Features

- **Integration with 42 MCP Servers**: Tool calling for all BuildwellAI calculation servers
- **Multi-Mode Responses**: Direct answers, thinking mode, and tool calling
- **Anti-Overfitting**: Early stopping, dropout, weight decay, validation monitoring
- **Streaming API**: OpenAI-compatible streaming inference server

## Quick Start

### 1. Prepare Dataset

```bash
cd /opt/buildwellai/buildwellai-llm-models/buildwellai-model-v2/scripts
python3 prepare_dataset.py
```

This will:
- Convert CSV files (BSI, UK Benchmark, Q&A pairs)
- Load existing JSONL datasets (thinking_mode, tool_calling)
- Generate MCP training data for all 42 servers
- Validate and deduplicate
- Split into train/validation sets
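
The validate, deduplicate, and split steps could look roughly like the sketch below. It is not the code in `prepare_dataset.py`: the chat-style `messages` record schema, the function names, and the fixed seed are all assumptions made for illustration.

```python
import hashlib
import json
import random


def is_valid(record):
    """Accept records with a non-empty list of messages that each have a role and content.

    The "messages" schema is an assumption; the real dataset format may differ.
    """
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict) and m.get("role") and m.get("content")
        for m in messages
    )


def deduplicate(records):
    """Drop exact duplicates, keyed on a hash of each record's canonical JSON form."""
    seen, unique = set(), []
    for record in records:
        key = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def split_train_validation(records, validation_fraction=0.05, seed=42):
    """Shuffle deterministically, then hold out a fraction for validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]
```

Hashing the canonical JSON only catches exact duplicates; near-duplicate questions with different wording would need a fuzzier comparison.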

### 2. Fine-Tune

```bash
python3 finetune.py
```

Or with a custom config:
```bash
python3 finetune.py --config ../configs/training_config.json
```

### 3. Run Streaming API

```bash
python3 streaming_api.py --model ../output/buildwellai-qwen3-14b-v2/merged --port 8080
```

Or run the interactive CLI:
```bash
python3 streaming_api.py --model ../output/buildwellai-qwen3-14b-v2/merged --cli
```

## Dataset Sources

| Source | Description | Count |
|--------|-------------|-------|
| qa-buildwell-ai.csv | Q&A pairs | ~39K |
| UK Building Control Benchmark | Building control questions | ~160 |
| BSI Flex 8670 | Building safety competence | ~47 |
| dataset_thinking_mode.jsonl | Reasoning examples | 10K |
| dataset_tool_calling.jsonl | Tool call examples | 8K |
| MCP Generated | All 42 MCP servers | ~250 |
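
As a rough sanity check, the approximate counts in the table can be summed and combined with the 5% validation split listed under Anti-Overfitting Measures. The figures are the rounded values from the table, so the result is only indicative, and deduplication may lower it.

```python
# Approximate counts copied from the Dataset Sources table.
approx_counts = {
    "qa-buildwell-ai.csv": 39_000,
    "UK Building Control Benchmark": 160,
    "BSI Flex 8670": 47,
    "dataset_thinking_mode.jsonl": 10_000,
    "dataset_tool_calling.jsonl": 8_000,
    "MCP Generated": 250,
}

total = sum(approx_counts.values())
validation = round(total * 0.05)  # 5% held out for validation
train = total - validation

print(f"total ~{total:,}; validation ~{validation:,}; train ~{train:,}")
# prints: total ~57,457; validation ~2,873; train ~54,584
```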

## Anti-Overfitting Measures

1. **Lower Learning Rate**: 1e-5 (vs typical 2e-4)
2. **Weight Decay**: 0.05 L2 regularization
3. **LoRA Dropout**: 0.1 in adapter layers
4. **Early Stopping**: Patience of 3, monitors val_loss
5. **Validation Split**: 5% held out for monitoring
6. **Lower LoRA Rank**: r=16 limits adapter capacity
7. **Fewer Epochs**: 2 epochs max
8. **Gradient Clipping**: max_grad_norm=0.5
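
The settings above can be collected in one place, and the early-stopping rule is small enough to sketch directly. This is an illustration only: the dictionary key names are hypothetical and may not match `training_config.json`, and the `EarlyStopping` class is a stand-in for whatever `finetune.py` actually uses.

```python
# Values taken from the list above; the key names are hypothetical.
ANTI_OVERFITTING = {
    "learning_rate": 1e-5,
    "weight_decay": 0.05,
    "lora_dropout": 0.1,
    "lora_r": 16,
    "num_epochs": 2,
    "max_grad_norm": 0.5,
    "validation_fraction": 0.05,
    "early_stopping_patience": 3,
}


class EarlyStopping:
    """Signal a stop once val_loss fails to improve for `patience` evaluations in a row."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_evals = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: record it and reset the counter.
            self.best = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = EarlyStopping(patience=ANTI_OVERFITTING["early_stopping_patience"])
# Made-up validation losses, one per evaluation, purely for demonstration.
for step, val_loss in enumerate([1.20, 1.05, 1.06, 1.07, 1.08, 1.01]):
    if stopper.should_stop(val_loss):
        print(f"stopping at evaluation {step}, best val_loss={stopper.best}")
        break
```

With these made-up losses, training stops at the third consecutive evaluation that fails to beat 1.05, so the later 1.01 is never reached. That is the trade-off a patience of 3 accepts.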

## Estimated Training Time

On 2x RTX A5000 (RunPod):

| Dataset Size | Method | Time | Cost |
|--------------|--------|------|------|
| ~60K samples | Unsloth QLoRA | ~12-15 hours | ~$13-16 |
| ~60K samples | HF Standard | ~25-30 hours | ~$25-30 |
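
The table implies an hourly rate and a speed-up between the two methods. The short calculation below derives both from the table's figures alone; it assumes the low cost pairs with the low duration and the high cost with the high duration.

```python
# (hours_low, hours_high, cost_low_usd, cost_high_usd) from the table above.
estimates = {
    "Unsloth QLoRA": (12, 15, 13, 16),
    "HF Standard": (25, 30, 25, 30),
}


def implied_hourly_rates(hours_low, hours_high, cost_low, cost_high):
    """Return the (min, max) implied USD per hour for one table row."""
    rates = (cost_low / hours_low, cost_high / hours_high)
    return min(rates), max(rates)


for method, row in estimates.items():
    low, high = implied_hourly_rates(*row)
    print(f"{method}: about ${low:.2f} to ${high:.2f} per hour")

# Speed-up of Unsloth QLoRA over HF Standard, worst case and best case.
speedup_min = estimates["HF Standard"][0] / estimates["Unsloth QLoRA"][1]
speedup_max = estimates["HF Standard"][1] / estimates["Unsloth QLoRA"][0]
print(f"speed-up: about {speedup_min:.1f}x to {speedup_max:.1f}x")
```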

## API Usage

### OpenAI-Compatible Endpoint

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="buildwellai-qwen3-14b-v2",
    messages=[
        {"role": "user", "content": "What are the PSI values for junction E5?"}
    ],
    stream=True
)

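for chunk in response:
    # Some chunks carry no text (delta.content is None), so skip them
    # rather than printing "None".
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)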
```

### Direct Endpoint

```bash
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Calculate U-value for cavity wall"}],
    "stream": false
  }'
```
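
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_chat_request` and `post_chat` are hypothetical, and because the response schema of `/chat` is not documented here, the parsed JSON is returned as-is.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080"):
    """Build a non-streaming POST request for the /chat endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_chat(prompt, base_url="http://localhost:8080", timeout=120):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `post_chat("Calculate U-value for cavity wall")` sends the same payload as the `curl` command above.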

## File Structure

```
buildwellai-model-v2/
├── configs/
│   └── training_config.json
├── datasets/
│   ├── train.jsonl
│   ├── validation.jsonl
│   └── dataset_stats.json
├── scripts/
│   ├── prepare_dataset.py
│   ├── finetune.py
│   └── streaming_api.py
├── output/
│   └── buildwellai-qwen3-14b-v2/
│       ├── adapter/
│       └── merged/
└── logs/
```

## MCP Servers Covered

All 42 BuildwellAI MCP calculation servers:

- Structural: Part A, disproportionate collapse
- Thermal: U-value, PSI, condensation, thermal break
- Energy: SAP10, SBEM, Part L, Passivhaus
- Fire: Safety, smoke ventilation, evacuation
- Sustainability: BREEAM, WELL, LCA, embodied carbon
- Water: Part G, drainage, SuDS, flood risk
- Comfort: Daylight, overheating, acoustics, ventilation
- And more...