---
language: en
license: mit
library_name: pytorch
tags:
- task-routing
- multi-task-learning
- foundation-model
- synthetic-data
- balanced-training
- software-engineering
metrics:
- accuracy
model-index:
- name: corch-v13-balanced
  results:
  - task:
      type: text-classification
      name: Task Routing
    metrics:
    - type: accuracy
      value: 87.30
      name: Average Accuracy
    - type: accuracy
      value: 100.00
      name: Domain Accuracy
    - type: accuracy
      value: 100.00
      name: Capability Accuracy
---

# Corch V13 Balanced: Task Routing Foundation Model

**87.30% Average Accuracy** | Perfect Domain & Capability Classification

A multi-task foundation model for routing software engineering tasks. Training on balanced synthetic data raised average accuracy to 87.30%, up from 66.84% for V10.

## Model Description

Corch V13 Balanced is an 805K-parameter neural network that classifies software engineering tasks along 4 dimensions:

1. **Domain** (19 classes): frontend, backend, machine_learning, etc. - **100% accuracy** 🎯
2. **Capability** (8 classes): code_generation, debugging, testing, etc. - **100% accuracy** 🎯
3. **Strategy** (2 classes): DIRECT vs ORCHESTRATE - **85.98% accuracy**
4. **Execution Type** (5 classes): single_task, multi_step, etc. - **63.20% accuracy**

## Performance

| Task | Accuracy | Improvement from V10 (percentage points) |
|------|----------|------------------------------------------|
| **Average** | **87.30%** | +20.46 |
| **Domain** | **100.00%** 🎯 | +14.59 |
| **Capability** | **100.00%** 🎯 | +39.61 |
| **Strategy** | **85.98%** | +12.55 |
| **Execution** | **63.20%** | +7.94 |

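The headline figure is consistent with an unweighted mean of the four per-task accuracies. The card does not state how the average is computed, so treat the formula below as an assumption rather than the authors' definition. A minimal sketch of the arithmetic:

```python
# Per-task accuracies (percent) as reported in the Performance table.
task_accuracy = {
    "domain": 100.00,
    "capability": 100.00,
    "strategy": 85.98,
    "execution": 63.20,
}

# Assumption: the headline number is a plain (unweighted) mean over the four heads.
average_accuracy = sum(task_accuracy.values()) / len(task_accuracy)

print(f"mean over {len(task_accuracy)} heads: {average_accuracy:.3f}%")
```

The mean comes out at roughly 87.295%, which matches the reported 87.30% after rounding.
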
## Key Innovation: Balanced Synthetic Data

The improvement came from correcting a severe class imbalance (324:1 ratio):
- Generated **49,307 synthetic examples** using GPT-5-Pro
- Balanced the dataset to ~10K examples per domain
- Eliminated zero accuracy on rare classes

**Before balancing:**
- `machine_learning` domain: 88 examples → 0% accuracy
- `other` domain: 57 examples → 0% accuracy

**After balancing:**
- All domains: ~10K examples → 100% accuracy ✅

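To make the balancing step concrete, the sketch below computes an imbalance ratio and the number of synthetic examples each class would need to reach a per-class target. The helper names are hypothetical and this is not the project's actual pipeline. The two rare-class counts come from this card, while the `backend` count is an invented value chosen only so that the ratio matches the reported 324:1.

```python
def imbalance_ratio(counts):
    """Ratio of the largest class size to the smallest."""
    return max(counts.values()) / min(counts.values())


def synthetic_needed(counts, target_per_class):
    """Examples to generate per class so that each reaches the target size."""
    return {label: max(0, target_per_class - n) for label, n in counts.items()}


# 88 and 57 are from the card; 18_468 is a hypothetical majority-class count.
domain_counts = {"machine_learning": 88, "other": 57, "backend": 18_468}

print(imbalance_ratio(domain_counts))            # 324.0
print(synthetic_needed(domain_counts, 10_000))
# {'machine_learning': 9912, 'other': 9943, 'backend': 0}
```

Classes that already exceed the target need no synthetic data. Reaching an equal size for them would require downsampling, which this sketch does not cover.
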
## Architecture

```
Input Text → BGE-large-en-v1.5 Embedding (1024d)
    ↓
Shared Layers:
  - Linear(1024 → 512) + ReLU + Dropout(0.3)
  - Linear(512 → 512) + ReLU + Dropout(0.3)
    ↓
Task-Specific Heads:
  ├─ Strategy Head   → Linear(512 → 2)
  ├─ Capability Head → Linear(512 → 8)
  ├─ Domain Head     → Linear(512 → 19)
  └─ Execution Head  → Linear(512 → 5)
```

- **Parameters:** 804,898
- **Training Time:** ~1 minute (30 epochs, early stopped)
- **Hardware:** AMD MI300X GPU

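The parameter count can be checked by hand from the layer shapes in the diagram. The sketch below assumes every `Linear` layer has a bias term, which is the PyTorch default. It counts only the routing network, not the BGE embedding model.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Two shared layers: 1024 -> 512 and 512 -> 512
shared = linear_params(1024, 512) + linear_params(512, 512)

# One linear head per task on top of the 512-d shared representation
heads = {
    "strategy": linear_params(512, 2),
    "capability": linear_params(512, 8),
    "domain": linear_params(512, 19),
    "execution": linear_params(512, 5),
}

total = shared + sum(heads.values())
print(total)  # 804898
```

ReLU and Dropout add no trainable parameters, so the shared layers (787,456) and the four heads (17,442) account for the full 804,898.
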
## Usage

```python
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, AutoModel

# Load the BGE embedding model that produces the 1024-d input vectors
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
embedding_model = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")

# Download the Corch V13 Balanced checkpoint
model_path = hf_hub_download(repo_id="bledden/corch-v13-balanced", filename="model_v13_balanced.pt")

# Label order must match the output index of each head
STRATEGIES = ["DIRECT", "ORCHESTRATE"]
CAPABILITIES = ["code_generation", "debugging", "documentation", "optimization",
                "refactoring", "testing", "design", "data_analysis"]
DOMAINS = ["frontend", "backend", "data_processing", "machine_learning", "devops",
           "testing", "security", "mobile", "data_engineering", "cloud", "database",
           "api", "ui_ux", "general", "iot", "blockchain", "game_dev", "embedded",
           "other"]
EXECUTION_TYPES = ["single_task", "multi_step", "iterative", "parallel", "sequential"]


class FoundationModelV13(torch.nn.Module):
    """Shared two-layer MLP trunk with one linear head per task."""

    def __init__(self):
        super().__init__()
        self.shared = torch.nn.Sequential(
            torch.nn.Linear(1024, 512),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.3),
            torch.nn.Linear(512, 512),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.3)
        )
        self.strategy_head = torch.nn.Linear(512, 2)
        self.capability_head = torch.nn.Linear(512, 8)
        self.domain_head = torch.nn.Linear(512, 19)
        self.execution_head = torch.nn.Linear(512, 5)

    def forward(self, x):
        shared = self.shared(x)
        return {
            'strategy': self.strategy_head(shared),
            'capability': self.capability_head(shared),
            'domain': self.domain_head(shared),
            'execution': self.execution_head(shared)
        }


model = FoundationModelV13()
# map_location lets the checkpoint load on machines without a GPU
checkpoint = torch.load(model_path, map_location="cpu", weights_only=True)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()


def route_task(task_text):
    """Embed a single task description and return the predicted label per head."""
    inputs = tokenizer(task_text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        # Use the [CLS] token of the last hidden state as the sentence embedding
        embedding = embedding_model(**inputs).last_hidden_state[:, 0, :]
        outputs = model(embedding)

    return {
        "strategy": STRATEGIES[outputs['strategy'].argmax().item()],
        "capability": CAPABILITIES[outputs['capability'].argmax().item()],
        "domain": DOMAINS[outputs['domain'].argmax().item()],
        "execution_type": EXECUTION_TYPES[outputs['execution'].argmax().item()]
    }


# Example
result = route_task("Build a CNN image classifier using PyTorch for medical imaging")
print(result)
# {
#     'strategy': 'ORCHESTRATE',
#     'capability': 'code_generation',
#     'domain': 'machine_learning',
#     'execution_type': 'multi_step'
# }
```

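`route_task` returns only the top label per head. If a confidence score is also wanted, one common approach is to apply a softmax to the head's logits and report the probability of the winning class. The sketch below uses plain Python to stay dependency-free; with tensors, `torch.softmax(logits, dim=-1)` does the same job. The `top_label` helper and the example logits are hypothetical, and softmax probabilities from this model are not necessarily well calibrated.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels):
    """Return the highest-scoring label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical strategy-head logits, for illustration only
label, confidence = top_label([0.4, 2.1], ["DIRECT", "ORCHESTRATE"])
print(label, round(confidence, 3))
```
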
## Training Data

- **Training set:** 31,592 examples (balanced)
- **Validation set:** 3,495 examples
- **Synthetic examples:** 49,307 (generated via GPT-5-Pro)
- **Real examples:** ~550K (existing dataset)
- **Final dataset:** Balanced to ~10K per domain

### Synthetic Data Generation

Synthetic examples were generated with GPT-5-Pro using domain-specific prompts of the form:

```
Generate a realistic software engineering task for: {domain}
Required: {capability}, {execution_type}, {strategy}
Output: 1-3 sentence task description with realistic terminology
```

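As an illustration of how such a template might be filled in, the sketch below substitutes one label combination into the prompt. The `build_prompt` helper is hypothetical, and the call to the generation model is deliberately left out because the card does not describe it.

```python
PROMPT_TEMPLATE = (
    "Generate a realistic software engineering task for: {domain}\n"
    "Required: {capability}, {execution_type}, {strategy}\n"
    "Output: 1-3 sentence task description with realistic terminology"
)


def build_prompt(domain, capability, execution_type, strategy):
    """Fill the template with one target label per head."""
    return PROMPT_TEMPLATE.format(
        domain=domain,
        capability=capability,
        execution_type=execution_type,
        strategy=strategy,
    )


prompt = build_prompt("machine_learning", "code_generation", "multi_step", "ORCHESTRATE")
print(prompt)
```

Because the labels are chosen before generation, each synthetic example arrives with its four targets already known, which is one way to steer the label distribution toward a balanced dataset.
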
- **Cost:** ~$500 for 49,307 examples (about $0.01 per example)
- **Quality:** all examples unique (no duplicates), schemas validated

## Label Mappings

- **Strategy (2):** DIRECT, ORCHESTRATE
- **Capability (8):** code_generation, debugging, documentation, optimization, refactoring, testing, design, data_analysis
- **Domain (19):** frontend, backend, data_processing, machine_learning, devops, testing, security, mobile, data_engineering, cloud, database, api, ui_ux, general, iot, blockchain, game_dev, embedded, other
- **Execution (5):** single_task, multi_step, iterative, parallel, sequential

## Comparison to Baselines

| Model | Architecture | Data | Avg Acc | Domain Acc |
|-------|--------------|------|---------|------------|
| Logistic Regression | Single-task | Imbalanced | 74.61% | 74.61% |
| V10 | Multi-task | Imbalanced | 66.84% | 85.41% |
| **V13 Balanced** | **Multi-task** | **Balanced** | **87.30%** | **100.00%** |

## Limitations

- Execution type prediction is the weakest head at 63.20% accuracy
- Context-independent (does not use conversation history yet)
- English-only
- Focused on software engineering tasks

## Citation

```bibtex
@software{corch_v13_balanced_2024,
  title = {Corch V13 Balanced: Task Routing Foundation Model},
  author = {Bledden, Team},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/bledden/corch-v13-balanced},
  note = {87.30% accuracy via balanced synthetic data generation}
}
```

## License

MIT License

## Links

- **GitHub:** https://github.com/bledden/Corch_by_Fac
- **Release Notes:** [RELEASE_V13_BALANCED.md](https://github.com/bledden/Corch_by_Fac/blob/main/RELEASE_V13_BALANCED.md)
- **Training Script:** [train_v13_option5_balanced.py](https://github.com/bledden/Corch_by_Fac/blob/main/training/scripts/train_v13_option5_balanced.py)

---

Built with ❤️ by the Corch Team | Powered by balanced synthetic data generation