---
language: en
license: mit
library_name: pytorch
tags:
- task-routing
- multi-task-learning
- foundation-model
- synthetic-data
- balanced-training
- software-engineering
metrics:
- accuracy
model-index:
- name: corch-v13-balanced
  results:
  - task:
      type: text-classification
      name: Task Routing
    metrics:
    - type: accuracy
      value: 87.30
      name: Average Accuracy
    - type: accuracy
      value: 100.00
      name: Domain Accuracy
    - type: accuracy
      value: 100.00
      name: Capability Accuracy
---

# Corch V13 Balanced: Task Routing Foundation Model

**87.30% Average Accuracy** | Perfect Domain & Capability Classification

A multi-task foundation model for routing software engineering tasks. It reaches 87.30% average accuracy across four classification heads after the training data was rebalanced with synthetic examples.

## Model Description

Corch V13 Balanced is an 805K-parameter neural network that classifies software engineering tasks along four dimensions:

1. **Domain** (19 classes): frontend, backend, machine_learning, etc. - **100% accuracy** 🎯
2. **Capability** (8 classes): code_generation, debugging, testing, etc. - **100% accuracy** 🎯  
3. **Strategy** (2 classes): DIRECT vs ORCHESTRATE - **85.98% accuracy**
4. **Execution Type** (5 classes): single_task, multi_step, etc. - **63.20% accuracy**

## Performance

| Task | Accuracy | Improvement over V10 (percentage points) |
|------|----------|------------------------------------------|
| **Average** | **87.30%** | +20.46 |
| **Domain** | **100.00%** 🎯 | +14.59 |
| **Capability** | **100.00%** 🎯 | +39.61 |
| **Strategy** | **85.98%** | +12.55 |
| **Execution** | **63.20%** | +7.94 |
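
The headline figure is consistent with an unweighted mean of the four per-task accuracies in the table above. A quick check, assuming each task is weighted equally (the variable names are illustrative only):

```python
# Per-task accuracies (percent), copied from the table above.
task_accuracy = {
    "domain": 100.00,
    "capability": 100.00,
    "strategy": 85.98,
    "execution": 63.20,
}

# Assumption: the headline number is an unweighted mean over the four tasks.
average_accuracy = sum(task_accuracy.values()) / len(task_accuracy)
print(f"{average_accuracy:.3f}")
```

The mean comes out at roughly 87.295, which rounds to the reported 87.30%.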

## Key Innovation: Balanced Synthetic Data

The breakthrough came from solving severe class imbalance (324:1 ratio):
- Generated **49,307 synthetic examples** using GPT-5-Pro
- Balanced the dataset to ~10K examples per domain
- Eliminated the zero-accuracy problem on rare classes

**Before balancing:**
- `machine_learning` domain: 88 examples → 0% accuracy
- `other` domain: 57 examples → 0% accuracy

**After balancing:**
- All domains: ~10K examples → 100% accuracy ✅
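
A minimal sketch of the balancing arithmetic, assuming a fixed per-domain target and a simple top-up rule. Only the two rare-class counts come from the figures above; `TARGET_PER_DOMAIN` and `synthetic_needed` are hypothetical names, and the actual generation pipeline may differ.

```python
# Real example counts for the two rare domains quoted above.
real_counts = {"machine_learning": 88, "other": 57}

# Assumption: every domain is topped up to roughly 10K examples.
TARGET_PER_DOMAIN = 10_000


def synthetic_needed(real_count: int, target: int = TARGET_PER_DOMAIN) -> int:
    """Number of synthetic examples required to reach the target (never negative)."""
    return max(0, target - real_count)


for domain, count in real_counts.items():
    print(domain, synthetic_needed(count))
```

Domains that already meet the target need no synthetic examples under this rule, so the top-up is concentrated on the rare classes.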

## Architecture

```
Input Text → BGE-large-en-v1.5 Embedding (1024d)
            ↓
Shared Layers:
  - Linear(1024 → 512) + ReLU + Dropout(0.3)
  - Linear(512 → 512) + ReLU + Dropout(0.3)
            ↓
Task-Specific Heads:
  ├─ Strategy Head → Linear(512 → 2)
  ├─ Capability Head → Linear(512 → 8)
  ├─ Domain Head → Linear(512 → 19)
  └─ Execution Head → Linear(512 → 5)
```

**Parameters:** 804,898  
**Training Time:** ~1 minute (30 epochs, early stopped)  
**Hardware:** AMD MI300X GPU
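
The parameter count follows from the layer sizes in the diagram above. A short check, counting weights plus biases for each `Linear` layer (the helper `linear_params` is illustrative, not part of the released code):

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Two shared layers: 1024 -> 512 and 512 -> 512.
shared = linear_params(1024, 512) + linear_params(512, 512)

# Four heads on the 512-d shared representation:
# strategy (2), capability (8), domain (19), execution (5).
heads = sum(linear_params(512, n_classes) for n_classes in (2, 8, 19, 5))

total = shared + heads
print(total)  # 804898
```

The BGE embedding model is not included in this count; it is used as a separate encoder.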

## Usage

```python
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, AutoModel

# Load BGE embedding model
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
embedding_model = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")

# Download the Corch V13 Balanced checkpoint from the Hub
model_path = hf_hub_download(repo_id="bledden/corch-v13-balanced", filename="model_v13_balanced.pt")

# Initialize model
class FoundationModelV13(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = torch.nn.Sequential(
            torch.nn.Linear(1024, 512),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.3),
            torch.nn.Linear(512, 512),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.3)
        )
        self.strategy_head = torch.nn.Linear(512, 2)
        self.capability_head = torch.nn.Linear(512, 8)
        self.domain_head = torch.nn.Linear(512, 19)
        self.execution_head = torch.nn.Linear(512, 5)
    
    def forward(self, x):
        shared = self.shared(x)
        return {
            'strategy': self.strategy_head(shared),
            'capability': self.capability_head(shared),
            'domain': self.domain_head(shared),
            'execution': self.execution_head(shared)
        }

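model = FoundationModelV13()
# map_location="cpu" lets a GPU-trained checkpoint load on CPU-only machines
checkpoint = torch.load(model_path, map_location="cpu", weights_only=True)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # disable dropout for inference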

# Embed and predict
def route_task(task_text):
    # Embed the task with BGE, using the [CLS] token as the sentence vector
    inputs = tokenizer(task_text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        embedding = embedding_model(**inputs).last_hidden_state[:, 0, :]
        # Run all four classification heads on the 1024-d embedding
        outputs = model(embedding)
    
    strategy = ["DIRECT", "ORCHESTRATE"][outputs['strategy'].argmax().item()]
    capability = ["code_generation", "debugging", "documentation", "optimization",
                  "refactoring", "testing", "design", "data_analysis"][outputs['capability'].argmax().item()]
    domain = ["frontend", "backend", "data_processing", "machine_learning", "devops",
              "testing", "security", "mobile", "data_engineering", "cloud", "database",
              "api", "ui_ux", "general", "iot", "blockchain", "game_dev", "embedded", 
              "other"][outputs['domain'].argmax().item()]
    execution = ["single_task", "multi_step", "iterative", "parallel", 
                 "sequential"][outputs['execution'].argmax().item()]
    
    return {
        "strategy": strategy,
        "capability": capability,
        "domain": domain,
        "execution_type": execution
    }

# Example
result = route_task("Build a CNN image classifier using PyTorch for medical imaging")
print(result)
# {
#   'strategy': 'ORCHESTRATE',
#   'capability': 'code_generation',
#   'domain': 'machine_learning',
#   'execution_type': 'multi_step'
# }
```
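
`route_task` returns only the arg-max labels. If a confidence score is also needed, one option is to apply a softmax to a head's raw logits. The sketch below uses plain Python lists so it runs without PyTorch. `softmax` and `top_label` are hypothetical helpers, not part of the released code, and the logits are made up for illustration. With tensors, `torch.softmax(logits, dim=-1)` computes the same quantity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels):
    """Return the most likely label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for the two-class strategy head, for illustration only.
label, confidence = top_label([0.2, 1.7], ["DIRECT", "ORCHESTRATE"])
print(label, round(confidence, 3))
```

Softmax scores from a classifier are not guaranteed to be calibrated, so treat them as a relative ranking signal rather than a true probability.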

## Training Data

- **Training set:** 31,592 examples (balanced)
- **Validation set:** 3,495 examples
- **Synthetic examples:** 49,307 (generated via GPT-5-Pro)
- **Real examples:** ~550K (existing dataset)
- **Final dataset:** Balanced to ~10K per domain

### Synthetic Data Generation

Synthetic examples were generated with GPT-5-Pro using domain-specific prompts of the form:

```
Generate a realistic software engineering task for: {domain}
Required: {capability}, {execution_type}, {strategy}
Output: 1-3 sentence task description with realistic terminology
```

**Cost:** ~$500 for 49,307 examples  
**Quality:** all examples unique (no duplicates), schemas validated
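
The prompt template above can be filled in programmatically for each label combination, and the generated tasks can be checked for duplicates afterwards. This is a sketch under assumptions: `build_prompt` and `drop_duplicates` are hypothetical helpers, the normalization rule (lowercasing and collapsing whitespace) is one possible choice, and the call to the generation model is deliberately left out.

```python
PROMPT_TEMPLATE = (
    "Generate a realistic software engineering task for: {domain}\n"
    "Required: {capability}, {execution_type}, {strategy}\n"
    "Output: 1-3 sentence task description with realistic terminology"
)


def build_prompt(domain, capability, execution_type, strategy):
    """Fill the template with one combination of target labels."""
    return PROMPT_TEMPLATE.format(
        domain=domain,
        capability=capability,
        execution_type=execution_type,
        strategy=strategy,
    )


def drop_duplicates(tasks):
    """Keep the first occurrence of each task, ignoring case and extra whitespace."""
    seen = set()
    unique = []
    for task in tasks:
        key = " ".join(task.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(task)
    return unique


prompt = build_prompt("machine_learning", "code_generation", "multi_step", "ORCHESTRATE")
print(prompt)
```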

## Label Mappings

**Strategy (2):** DIRECT, ORCHESTRATE  
**Capability (8):** code_generation, debugging, documentation, optimization, refactoring, testing, design, data_analysis  
**Domain (19):** frontend, backend, data_processing, machine_learning, devops, testing, security, mobile, data_engineering, cloud, database, api, ui_ux, general, iot, blockchain, game_dev, embedded, other  
**Execution (5):** single_task, multi_step, iterative, parallel, sequential
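
For decoding predictions, the mappings above can be kept in one lookup structure. The sketch assumes class indices follow the order listed here, which matches the lists in the Usage example. `LABELS`, `LABEL_TO_INDEX`, and `decode` are illustrative names.

```python
# Class names per head, in index order (assumed to match the listing above).
LABELS = {
    "strategy": ["DIRECT", "ORCHESTRATE"],
    "capability": [
        "code_generation", "debugging", "documentation", "optimization",
        "refactoring", "testing", "design", "data_analysis",
    ],
    "domain": [
        "frontend", "backend", "data_processing", "machine_learning", "devops",
        "testing", "security", "mobile", "data_engineering", "cloud", "database",
        "api", "ui_ux", "general", "iot", "blockchain", "game_dev", "embedded",
        "other",
    ],
    "execution": ["single_task", "multi_step", "iterative", "parallel", "sequential"],
}

# Reverse lookup: label name -> class index, per head.
LABEL_TO_INDEX = {
    head: {label: index for index, label in enumerate(names)}
    for head, names in LABELS.items()
}


def decode(head: str, index: int) -> str:
    """Map a predicted class index back to its label name."""
    return LABELS[head][index]


print(decode("domain", 3))  # machine_learning
```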

## Comparison to Baselines

| Model | Architecture | Data | Avg Acc | Domain Acc |
|-------|--------------|------|---------|------------|
| Logistic Regression | Single-task | Imbalanced | 74.61% | 74.61% |
| V10 | Multi-task | Imbalanced | 66.84% | 85.41% |
| **V13 Balanced** | **Multi-task** | **Balanced** | **87.30%** | **100.00%** |

## Limitations

- Execution type is the weakest head at 63.20% accuracy
- Context-independent: each task is classified in isolation, without conversation history
- English-only
- Focused on software engineering tasks

## Citation

```bibtex
@software{corch_v13_balanced_2024,
  title = {Corch V13 Balanced: Task Routing Foundation Model},
  author = {Bledden, Team},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/bledden/corch-v13-balanced},
  note = {87.30% accuracy via balanced synthetic data generation}
}
```

## License

MIT License

## Links

- **GitHub:** https://github.com/bledden/Corch_by_Fac
- **Release Notes:** [RELEASE_V13_BALANCED.md](https://github.com/bledden/Corch_by_Fac/blob/main/RELEASE_V13_BALANCED.md)
- **Training Script:** [train_v13_option5_balanced.py](https://github.com/bledden/Corch_by_Fac/blob/main/training/scripts/train_v13_option5_balanced.py)

---

Built with ❤️ by the Corch Team | Powered by balanced synthetic data generation