---
language:
- en
license: mit
tags:
- code
- routing
- classification
- multi-task-learning
- software-development
- codebert
base_model: microsoft/codebert-base
model-index:
- name: facilitair-codebert-routing-v1
  results:
  - task:
      type: text-classification
      name: Text Classification
    metrics:
    - type: accuracy
      value: 99.93
      name: Accuracy
datasets:
- facilitair/routing-dataset-v1
pipeline_tag: text-classification
widget:
- text: "Build a React component for user authentication"
  example_title: "Frontend Task"
- text: "Fix database connection pool timeout error"
  example_title: "Database Task"
- text: "Deploy Docker container to AWS ECS"
  example_title: "DevOps Task"
- text: "Train neural network on customer data"
  example_title: "ML Task"
---

# Facilitair CodeBERT Routing Model v1

- **Accuracy**: 99.93% (validation)
- **Task**: Multi-task routing for software development tasks
- **License**: MIT
- **Base Model**: microsoft/codebert-base (125M parameters)

---

## Model Description

This model routes software development tasks by predicting, for each task description, a domain, an execution strategy, the required capabilities, and an execution type. It reaches 99.93% accuracy on a validation set of technical tasks.

### Capabilities

The model performs 4 simultaneous predictions:

1. **Domain Classification** (19 classes):
   - frontend, backend, data, ml, devops, mobile, cloud, security
   - general, testing, database, infrastructure, api, microservices
   - blockchain, networking, embedded, gaming, system_design

2. **Strategy Classification** (2 classes):
   - DIRECT: Execute immediately
   - ORCHESTRATE: Complex multi-step execution

3. **Capability Detection** (8 labels, multi-label):
   - code_generation, debugging, testing, refactoring
   - optimization, documentation, deployment, data_analysis

4. **Execution Type** (5 classes):
   - single_task, multi_step, iterative, parallel, sequential
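
The four outputs are decoded in two different ways: the domain, strategy, and execution heads are single-label (softmax, then argmax), while the capability head is multi-label (an independent sigmoid per label). The following is a minimal sketch in plain Python, assuming each head returns raw logits in the label order listed above. The helper names, the 0.5 threshold, and the sample logits are illustrative assumptions, not part of the released code.

```python
import math

STRATEGIES = ["DIRECT", "ORCHESTRATE"]
CAPABILITIES = ["code_generation", "debugging", "testing", "refactoring",
                "optimization", "documentation", "deployment", "data_analysis"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_single_label(logits, labels):
    """Return the most likely label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


def decode_multi_label(logits, labels, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold (assumed 0.5)."""
    return [label for label, x in zip(labels, logits) if sigmoid(x) >= threshold]


# Made-up logits, for illustration only
strategy, confidence = decode_single_label([2.0, -1.0], STRATEGIES)
capabilities = decode_multi_label([3.0, -2.0, 0.5, -4.0, -3.0, -1.0, -2.5, -3.5], CAPABILITIES)
print(strategy, round(confidence, 3), capabilities)
```

Because the capability head is multi-label, it can return zero, one, or several capabilities for the same task.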

### Performance

| Metric | Score |
|--------|-------|
| Overall Accuracy | 99.93% |
| Minimum Per-Domain | 99.1% (backend) |
| Domains at 100.0% | 16/19 |
| Training Time | 4.7 hours on AMD MI300X |
| Model Size | 477MB |

---

## Usage

### Python (Transformers)

```python
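import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import RobertaModel, RobertaTokenizer

REPO_ID = "somethingobscurefordevstuff/facilitair-codebert-routing-v1"
DOMAINS = ["frontend", "backend", "data", "ml", "devops", "mobile", "cloud", "security",
           "general", "testing", "database", "infrastructure", "api", "microservices",
           "blockchain", "networking", "embedded", "gaming", "system_design"]


class RoutingModel(nn.Module):
    """CodeBERT encoder with four classification heads.

    ASSUMPTION: the attribute names, the ReLU activation, and first-token pooling
    are inferred from the Model Architecture section. They must match the keys in
    the checkpoint's state dict; adjust them if load_state_dict reports mismatches.
    """

    def __init__(self):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("microsoft/codebert-base")

        def head(n_out):
            return nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, n_out))

        self.domain_head, self.strategy_head = head(19), head(2)
        self.capability_head, self.execution_head = head(8), head(5)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        cls = hidden[:, 0]  # representation of the first (<s>) token
        return (self.domain_head(cls), self.strategy_head(cls),
                self.capability_head(cls), self.execution_head(cls))


# Download and load the trained weights (a plain RobertaModel has no routing heads)
checkpoint_path = hf_hub_download(repo_id=REPO_ID, filename="codebert_best_model.pt")
checkpoint = torch.load(checkpoint_path, map_location="cpu")
model = RoutingModel()
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Tokenize input
tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")
task = "Build a React component for user login"
encoding = tokenizer(task, max_length=512, padding='max_length', truncation=True, return_tensors='pt')

# Predict
with torch.no_grad():
    domain_logits, strategy_logits, capability_logits, execution_logits = model(
        encoding['input_ids'],
        encoding['attention_mask']
    )

# Get domain prediction
domain_idx = torch.argmax(domain_logits, dim=1).item()
print(f"Domain: {DOMAINS[domain_idx]}")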
```

### Using Facilitair Inference API

```python
from huggingface_hub import hf_hub_download
from facilitair_inference import CodeBERTRouter  # Facilitair's inference code

# Download model
model_path = hf_hub_download(
    repo_id="somethingobscurefordevstuff/facilitair-codebert-routing-v1",
    filename="codebert_best_model.pt"
)

router = CodeBERTRouter(model_path=model_path)
result = router.route_task("Build a React component")

# Trailing comments show example output
print(f"Domain: {result['domain']}")  # e.g. frontend
print(f"Confidence: {result['domain_confidence']:.1%}")  # e.g. 95.8%
print(f"Strategy: {result['strategy']}")  # e.g. DIRECT
print(f"Capabilities: {result['capabilities']}")  # e.g. ['code_generation']
```

---

## Training Data

- **Size**: 149,986 examples
- **Distribution**: Perfectly balanced across 19 domains (7,894 per domain)
- **Task Types**:
  - 66.6% short (3-8 words)
  - 33.3% medium (10-20 words)
  - 0.1% long (30-50 words)
- **Domains**: All technical domains (frontend, backend, DevOps, ML, etc.)
- **Note**: Not trained on non-coding tasks (meetings, business analysis, etc.)
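
The dataset figures can be cross-checked with a little arithmetic. This sketch assumes the validation split holds 790 examples per domain, as listed in the per-domain evaluation table; the variable names are illustrative.

```python
DOMAIN_COUNT = 19
EXAMPLES_PER_DOMAIN = 7_894
VALIDATION_PER_DOMAIN = 790  # taken from the per-domain evaluation table

total_examples = DOMAIN_COUNT * EXAMPLES_PER_DOMAIN
validation_examples = DOMAIN_COUNT * VALIDATION_PER_DOMAIN
train_examples = total_examples - validation_examples

print(f"total:      {total_examples:,}")
print(f"validation: {validation_examples:,} ({validation_examples / total_examples:.1%})")
print(f"train:      {train_examples:,}")
```

Under that assumption the split works out to roughly 90% training and 10% validation, which matches the rounded 135K / 15K figures.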

---

## Model Architecture

```
CodeBERT Base (microsoft/codebert-base)
├── 12 transformer layers
├── 768 hidden size
├── 12 attention heads
└── 125M total parameters

Classification Heads:
├── Domain Head: 768 → 256 → 19
├── Strategy Head: 768 → 256 → 2
├── Capability Head: 768 → 256 → 8 (multi-label)
└── Execution Head: 768 → 256 → 5
```
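
To put the head sizes in perspective, the sketch below counts the parameters the four heads add on top of the encoder. It assumes each head is exactly two fully connected layers with biases (768 → 256 → N) and nothing else, such as normalization layers; the released checkpoint may differ.

```python
HIDDEN_SIZE = 768      # CodeBERT hidden size
BOTTLENECK_SIZE = 256  # intermediate width of each head
HEAD_OUTPUTS = {"domain": 19, "strategy": 2, "capability": 8, "execution": 5}


def head_parameters(n_out, hidden=HIDDEN_SIZE, mid=BOTTLENECK_SIZE):
    """Weights plus biases for Linear(hidden, mid) followed by Linear(mid, n_out)."""
    return (hidden * mid + mid) + (mid * n_out + n_out)


per_head = {name: head_parameters(n_out) for name, n_out in HEAD_OUTPUTS.items()}
total_head_parameters = sum(per_head.values())

for name, count in per_head.items():
    print(f"{name:<10} {count:>9,}")
print(f"{'total':<10} {total_head_parameters:>9,}")
```

Under these assumptions the heads add fewer than one million parameters, well under 1% of the 125M-parameter encoder.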

---

## Training Details

- **Base Model**: microsoft/codebert-base
- **Training Examples**: 149,986 (135K train, 15K validation)
- **Epochs**: 10 (early stopping triggered)
- **Best Epoch**: 4 (validation loss: 0.2146)
- **Batch Size**: 16
- **Learning Rate**: 2e-5
- **Optimizer**: AdamW with warmup
- **Hardware**: AMD MI300X (192GB HBM3)
- **Training Time**: 4.7 hours
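
The card states only "AdamW with warmup", so the exact schedule is not known. As an illustration, the sketch below computes a linear warmup followed by linear decay from the listed batch size and learning rate. The 10% warmup share, the decay shape, the use of all 10 epochs for the step count, and the training-set size (total examples minus 790 validation examples per domain) are all assumptions.

```python
import math

TRAIN_EXAMPLES = 149_986 - 19 * 790  # assumed: total minus validation examples
BATCH_SIZE = 16
EPOCHS = 10
PEAK_LR = 2e-5

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = total_steps // 10     # hypothetical 10% warmup; not stated in this card


def learning_rate(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0 (assumed shape)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>6}: lr = {learning_rate(step):.2e}")
```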

### Loss Weighting

- Domain: 50%
- Capability: 25%
- Strategy: 15%
- Execution: 10%
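
These weights combine the four per-task losses into a single training objective. The sketch below shows one plausible form in plain Python, assuming cross-entropy for the three single-label heads and binary cross-entropy for the multi-label capability head. The loss functions and helper names are assumptions; the card specifies only the weights.

```python
import math

LOSS_WEIGHTS = {"domain": 0.50, "capability": 0.25, "strategy": 0.15, "execution": 0.10}


def cross_entropy(logits, target_index):
    """Cross-entropy for one example, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def binary_cross_entropy_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, in a numerically stable form."""
    losses = [max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
              for x, t in zip(logits, targets)]
    return sum(losses) / len(losses)


def combined_loss(task_losses):
    """Weighted sum of the per-task losses."""
    return sum(LOSS_WEIGHTS[name] * loss for name, loss in task_losses.items())


# Made-up logits and targets, for illustration only
task_losses = {
    "domain": cross_entropy([2.0, 0.0, -1.0], 0),
    "capability": binary_cross_entropy_with_logits([3.0, -2.0], [1.0, 0.0]),
    "strategy": cross_entropy([1.5, -0.5], 0),
    "execution": cross_entropy([0.2, 0.1, 0.0, -0.1, -0.2], 0),
}
print(f"combined loss: {combined_loss(task_losses):.4f}")
```

Since the weights sum to 1, the combined loss is a weighted average, with domain classification contributing half of the gradient signal.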

---

## Evaluation Results

### Per-Domain Accuracy (Validation Set)

| Domain | Accuracy | Examples |
|--------|----------|----------|
| frontend | 100.0% | 790 |
| backend | 99.1% | 790 |
| data | 100.0% | 790 |
| ml | 100.0% | 790 |
| devops | 99.6% | 790 |
| mobile | 100.0% | 790 |
| cloud | 100.0% | 790 |
| security | 100.0% | 790 |
| general | 100.0% | 790 |
| testing | 100.0% | 790 |
| database | 100.0% | 790 |
| infrastructure | 99.8% | 790 |
| api | 100.0% | 790 |
| microservices | 100.0% | 790 |
| blockchain | 100.0% | 790 |
| networking | 100.0% | 790 |
| embedded | 100.0% | 790 |
| gaming | 100.0% | 790 |
| system_design | 100.0% | 790 |

**Summary**: 16/19 domains at 100.0%; minimum 99.1% (backend)
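
Because every domain has the same number of validation examples, the overall domain accuracy is simply the mean of the per-domain values. The sketch below recomputes it from the table; the per-domain figures are rounded to one decimal, so the result is approximate.

```python
PER_DOMAIN_ACCURACY = {
    "frontend": 100.0, "backend": 99.1, "data": 100.0, "ml": 100.0,
    "devops": 99.6, "mobile": 100.0, "cloud": 100.0, "security": 100.0,
    "general": 100.0, "testing": 100.0, "database": 100.0,
    "infrastructure": 99.8, "api": 100.0, "microservices": 100.0,
    "blockchain": 100.0, "networking": 100.0, "embedded": 100.0,
    "gaming": 100.0, "system_design": 100.0,
}

# Equal class sizes (790 each) make the macro average equal to overall accuracy
mean_accuracy = sum(PER_DOMAIN_ACCURACY.values()) / len(PER_DOMAIN_ACCURACY)
below_perfect = sorted(name for name, acc in PER_DOMAIN_ACCURACY.items() if acc < 100.0)

print(f"mean per-domain accuracy: {mean_accuracy:.2f}%")
print(f"domains below 100%: {below_perfect}")
```

The mean comes out at about 99.92%, close to the reported 99.93%; the small gap is plausibly due to rounding in the table.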

---

## Limitations

1. **Non-Coding Tasks**: The model is trained exclusively on technical software development tasks. It may misclassify:
   - Business analysis tasks
   - Meeting scheduling
   - Document writing
   - General Q&A

2. **Confidence Thresholds**: For production use, consider applying a confidence threshold (e.g., 70%) and falling back to the "general" domain for uncertain predictions.

3. **Domain Overlap**: Some tasks may legitimately belong to multiple domains. The model predicts only the single most likely domain.
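
The threshold fallback and the domain-overlap caveat can both be handled in post-processing. The following is a minimal sketch, assuming the domain probabilities are available as a dictionary; the function names and sample probabilities are hypothetical, and the 70% threshold is the example value mentioned above, not a tuned setting.

```python
def route_with_fallback(domain_probs, threshold=0.70, fallback="general"):
    """Return the top domain, or the fallback when its probability is below the threshold."""
    domain, confidence = max(domain_probs.items(), key=lambda item: item[1])
    if confidence < threshold:
        return fallback, confidence
    return domain, confidence


def top_k_domains(domain_probs, k=2):
    """Return the k most likely domains, useful when a task spans several domains."""
    ranked = sorted(domain_probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up probabilities, for illustration only
confident = {"frontend": 0.92, "backend": 0.05, "api": 0.03}
uncertain = {"backend": 0.45, "api": 0.40, "database": 0.15}

print(route_with_fallback(confident))
print(route_with_fallback(uncertain))
print(top_k_domains(uncertain, k=2))
```

Inspecting the top two candidates alongside the routed domain makes overlapping cases, such as backend versus api, visible to downstream logic.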

---

## Citation

If you use this model, please cite:

```bibtex
@software{facilitair_codebert_routing_2025,
  title={Facilitair CodeBERT Routing Model v1},
  author={Facilitair Team},
  year={2025},
  url={https://huggingface.co/somethingobscurefordevstuff/facilitair-codebert-routing-v1}
}
```

---

## License

MIT License - Free for commercial use

---

## Contact

- **Repository**: https://github.com/facilitair/codebert-routing
- **Issues**: https://github.com/facilitair/codebert-routing/issues
- **Website**: https://beta.facilitair.ai

---

## Version History

### v1.0.0 (2025-11-17)
- Initial release
- 99.93% validation accuracy
- 19 domains, 2 strategies, 8 capabilities, 5 execution types
- Trained on 150K balanced examples

---

**Model Card**: [Full Model Card](model-card.md)
**Training Details**: [Training Report](training-report.md)