	---
	language:
	- en
	license: mit
	tags:
	- code
	- routing
	- classification
	- multi-task-learning
	- software-development
	- codebert
	base_model: microsoft/codebert-base
	model-index:
	- name: facilitair-codebert-routing-v1
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    metrics:
	    - type: accuracy
	      value: 99.93
	      name: Accuracy
	datasets:
	- facilitair/routing-dataset-v1
	pipeline_tag: text-classification
	widget:
	- text: "Build a React component for user authentication"
	  example_title: "Frontend Task"
	- text: "Fix database connection pool timeout error"
	  example_title: "Database Task"
	- text: "Deploy Docker container to AWS ECS"
	  example_title: "DevOps Task"
	- text: "Train neural network on customer data"
	  example_title: "ML Task"
	---

	# Facilitair CodeBERT Routing Model v1

	- Accuracy: 99.93% (validation)
	- Task: Multi-task routing for software development tasks
	- License: MIT
	- Base Model: microsoft/codebert-base (125M parameters)

	---

	## Model Description

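	This model classifies a software development task description along four axes (domain, strategy, required capabilities, and execution type) so that a router can dispatch it. It reaches 99.93% accuracy on the validation set, which contains technical tasks only.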

	### Capabilities

	The model performs 4 simultaneous predictions:

	1. Domain Classification (19 classes):
	   - frontend, backend, data, ml, devops, mobile, cloud, security
	   - general, testing, database, infrastructure, api, microservices
	   - blockchain, networking, embedded, gaming, system_design

	2. Strategy Classification (2 classes):
	   - DIRECT: Execute immediately
	   - ORCHESTRATE: Complex multi-step execution

	3. Capability Detection (8 multi-label):
	   - code_generation, debugging, testing, refactoring
	   - optimization, documentation, deployment, data_analysis

	4. Execution Type (5 classes):
	   - single_task, multi_step, iterative, parallel, sequential
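
The difference between the single-label heads and the multi-label capability head can be sketched in plain Python. This is an illustration, not the model's inference code: the helper names, the example logits, the 0.5 sigmoid threshold, and the assumption that label indices follow the order listed above are all hypothetical.

```python
import math

STRATEGIES = ["DIRECT", "ORCHESTRATE"]
CAPABILITIES = ["code_generation", "debugging", "testing", "refactoring",
                "optimization", "documentation", "deployment", "data_analysis"]
EXECUTION_TYPES = ["single_task", "multi_step", "iterative", "parallel", "sequential"]


def argmax_label(logits, labels):
    """Single-label head: pick the class with the highest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]


def multilabel(logits, labels, threshold=0.5):
    """Multi-label head: keep every label whose sigmoid probability reaches the threshold."""
    return [label for logit, label in zip(logits, labels)
            if 1.0 / (1.0 + math.exp(-logit)) >= threshold]


# Made-up logits for illustration only
strategy = argmax_label([2.1, -0.7], STRATEGIES)
capabilities = multilabel([3.0, -2.0, 0.4, -1.5, -3.0, -0.2, -4.0, -2.5], CAPABILITIES)
execution = argmax_label([0.1, 1.9, -0.3, 0.0, 0.5], EXECUTION_TYPES)
print(strategy, capabilities, execution)
```

A single-label head always returns exactly one class, while the capability head can return zero, one, or several labels for the same task.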

	### Performance

	| Metric | Score |
	|--------|-------|
	| Overall Accuracy | 99.93% |
	| Minimum Per-Domain | 99.1% (backend) |
	| Perfect Domains | 16/19 at 100.0% |
	| Training Time | 4.7 hours on AMD MI300X |
	| Model Size | 477MB |

	---

	## Usage

	### Python (Transformers)

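	```python
	import torch
	import torch.nn as nn
	from huggingface_hub import hf_hub_download
	from transformers import RobertaModel, RobertaTokenizer

	REPO_ID = "somethingobscurefordevstuff/facilitair-codebert-routing-v1"


	class RoutingModel(nn.Module):
	    """CodeBERT encoder with four heads (768 -> 256 -> N).

	    A plain RobertaModel returns hidden states, not four logit tensors, so a
	    wrapper is needed. Assumption: the layer names, activation, and pooling
	    below are a guess at the training code and must match the checkpoint keys.
	    """

	    def __init__(self):
	        super().__init__()
	        self.encoder = RobertaModel.from_pretrained("microsoft/codebert-base")
	        self.domain_head = self._head(19)
	        self.strategy_head = self._head(2)
	        self.capability_head = self._head(8)
	        self.execution_head = self._head(5)

	    @staticmethod
	    def _head(num_outputs):
	        return nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, num_outputs))

	    def forward(self, input_ids, attention_mask):
	        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
	        cls = hidden[:, 0]  # representation of the first (<s>) token
	        return (self.domain_head(cls), self.strategy_head(cls),
	                self.capability_head(cls), self.execution_head(cls))


	# Load tokenizer and trained weights
	tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")
	checkpoint_path = hf_hub_download(repo_id=REPO_ID, filename="codebert_best_model.pt")
	checkpoint = torch.load(checkpoint_path, map_location="cpu")
	model = RoutingModel()
	model.load_state_dict(checkpoint["model_state_dict"])
	model.eval()

	# Tokenize input
	task = "Build a React component for user login"
	encoding = tokenizer(task, max_length=512, padding="max_length", truncation=True, return_tensors="pt")

	# Predict
	with torch.no_grad():
	    domain_logits, strategy_logits, capability_logits, execution_logits = model(
	        encoding["input_ids"], encoding["attention_mask"]
	    )

	# Get domain prediction
	domain_idx = torch.argmax(domain_logits, dim=1).item()
	domains = ["frontend", "backend", "data", "ml", "devops", "mobile", "cloud", "security",
	           "general", "testing", "database", "infrastructure", "api", "microservices",
	           "blockchain", "networking", "embedded", "gaming", "system_design"]
	print(f"Domain: {domains[domain_idx]}")
	```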

	### Using Facilitair Inference API

	```python
	from huggingface_hub import hf_hub_download

	# Download model
	model_path = hf_hub_download(
	    repo_id="somethingobscurefordevstuff/facilitair-codebert-routing-v1",
	    filename="codebert_best_model.pt",
	)

	# Use with Facilitair's inference code
	from facilitair_inference import CodeBERTRouter

	router = CodeBERTRouter(model_path=model_path)
	result = router.route_task("Build a React component")

	print(f"Domain: {result['domain']}") # frontend
	print(f"Confidence: {result['domain_confidence']:.1%}") # 95.8%
	print(f"Strategy: {result['strategy']}") # DIRECT
	print(f"Capabilities: {result['capabilities']}") # ['code_generation']
	```

	---

	## Training Data

	- Size: 149,986 examples
	- Distribution: Perfectly balanced across 19 domains (7,894 per domain)
	- Task Types:
	  - 66.6% short (3-8 words)
	  - 33.3% medium (10-20 words)
	  - 0.1% long (30-50 words)
	- Domains: All technical domains (frontend, backend, DevOps, ML, etc.)
	- Note: Not trained on non-coding tasks (meetings, business analysis, etc.)

	---

	## Model Architecture

	```
	CodeBERT Base (microsoft/codebert-base)
	├── 12 transformer layers
	├── 768 hidden size
	├── 12 attention heads
	└── 125M total parameters

	Classification Heads:
	├── Domain Head: 768 → 256 → 19
	├── Strategy Head: 768 → 256 → 2
	├── Capability Head: 768 → 256 → 8 (multi-label)
	└── Execution Head: 768 → 256 → 5
	```
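
To put the head sizes in perspective, the sketch below counts their parameters. It assumes each arrow in the diagram is one fully connected layer with a bias and that the heads contain nothing else (no normalization layers); the source does not confirm either detail.

```python
HIDDEN = 768       # CodeBERT hidden size
BOTTLENECK = 256   # intermediate width of each head
HEAD_OUTPUTS = {"domain": 19, "strategy": 2, "capability": 8, "execution": 5}


def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


head_params = {
    name: linear_params(HIDDEN, BOTTLENECK) + linear_params(BOTTLENECK, n_out)
    for name, n_out in HEAD_OUTPUTS.items()
}
total_head_params = sum(head_params.values())

for name, count in head_params.items():
    print(f"{name:<10} {count:>8,}")
print(f"{'total':<10} {total_head_params:>8,}")
```

Under these assumptions the four heads add roughly 0.8M parameters, well under 1% of the 125M-parameter encoder, so almost all of the model's capacity sits in the shared CodeBERT backbone.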

	---

	## Training Details

	- Base Model: microsoft/codebert-base
	- Training Examples: 149,986 (135K train, 15K validation)
	- Epochs: 10 (early stopping triggered)
	- Best Epoch: 4 (validation loss: 0.2146)
	- Batch Size: 16
	- Learning Rate: 2e-5
	- Optimizer: AdamW with warmup
	- Hardware: AMD MI300X (192GB HBM3)
	- Training Time: 4.7 hours
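
The split sizes can be cross-checked against the per-domain counts reported in the Training Data section and the per-domain evaluation table. The step count is a rough estimate that assumes a single device, no gradient accumulation, and that the final partial batch is kept; none of these is stated in the source.

```python
import math

NUM_DOMAINS = 19
PER_DOMAIN_TOTAL = 7_894   # examples per domain, from Training Data
PER_DOMAIN_VAL = 790       # validation examples per domain, from the evaluation table
BATCH_SIZE = 16

total = NUM_DOMAINS * PER_DOMAIN_TOTAL
val = NUM_DOMAINS * PER_DOMAIN_VAL
train = total - val
steps_per_epoch = math.ceil(train / BATCH_SIZE)

print(f"total={total:,} train={train:,} val={val:,}")
print(f"validation share={val / total:.1%}")
print(f"optimizer steps per epoch={steps_per_epoch:,}")
```

The totals match the reported 149,986 examples and the rounded 135K/15K split, with about 10% of each domain held out for validation.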

	### Loss Weighting

	- Domain: 50%
	- Capability: 25%
	- Strategy: 15%
	- Execution: 10%
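
One plausible reading of these weights is a weighted sum of the four per-head losses. The sketch below shows that combination in plain Python; the function name and the example loss values are hypothetical, and the source does not state which loss function each head uses.

```python
LOSS_WEIGHTS = {"domain": 0.50, "capability": 0.25, "strategy": 0.15, "execution": 0.10}


def combined_loss(task_losses, weights=LOSS_WEIGHTS):
    """Weighted sum of the per-head losses."""
    missing = set(weights) - set(task_losses)
    if missing:
        raise KeyError(f"missing losses for: {sorted(missing)}")
    return sum(weights[name] * task_losses[name] for name in weights)


# Made-up per-head loss values for one batch
example = {"domain": 0.40, "capability": 0.20, "strategy": 0.10, "execution": 0.30}
print(f"{combined_loss(example):.3f}")
```

Because the weights sum to 1, the combined value stays on the same scale as the individual losses, and the domain head contributes half of the training signal.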

	---

	## Evaluation Results

	### Per-Domain Accuracy (Validation Set)

	| Domain | Accuracy | Examples |
	|--------|----------|----------|
	| frontend | 100.0% | 790 |
	| backend | 99.1% | 790 |
	| data | 100.0% | 790 |
	| ml | 100.0% | 790 |
	| devops | 99.6% | 790 |
	| mobile | 100.0% | 790 |
	| cloud | 100.0% | 790 |
	| security | 100.0% | 790 |
	| general | 100.0% | 790 |
	| testing | 100.0% | 790 |
	| database | 100.0% | 790 |
	| infrastructure | 99.8% | 790 |
	| api | 100.0% | 790 |
	| microservices | 100.0% | 790 |
	| blockchain | 100.0% | 790 |
	| networking | 100.0% | 790 |
	| embedded | 100.0% | 790 |
	| gaming | 100.0% | 790 |
	| system_design | 100.0% | 790 |

	Summary: 16/19 domains perfect (100%); backend, devops, and infrastructure fall slightly below, with a minimum of 99.1% (backend)

	---

	## Limitations

	1. Non-Coding Tasks: The model is trained exclusively on technical software development tasks. It may misclassify:
	   - Business analysis tasks
	   - Meeting scheduling
	   - Document writing
	   - General Q&A

	2. Confidence Thresholds: For production use, consider applying a confidence threshold (e.g., 70%) and falling back to the "general" domain for uncertain predictions.

	3. Domain Overlap: Some tasks legitimately belong to multiple domains, but the model predicts only the single most likely domain.
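
The confidence-threshold suggestion in item 2 could look like the sketch below. It operates on a result dictionary with the `domain` and `domain_confidence` keys shown in the Usage section; the function name and the `fallback_applied` flag are hypothetical, and 70% is only the example value mentioned above, not a tuned threshold.

```python
def apply_confidence_fallback(result, threshold=0.70, fallback_domain="general"):
    """Return a copy of `result` whose domain is replaced when confidence is low."""
    routed = dict(result)
    routed["fallback_applied"] = result["domain_confidence"] < threshold
    if routed["fallback_applied"]:
        routed["domain"] = fallback_domain
    return routed


confident = apply_confidence_fallback({"domain": "frontend", "domain_confidence": 0.958})
uncertain = apply_confidence_fallback({"domain": "backend", "domain_confidence": 0.55})
print(confident["domain"], uncertain["domain"])
```

Keeping the flag alongside the rewritten domain makes it possible to log how often the fallback fires, which helps when choosing a threshold on real traffic.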

	---

	## Citation

	If you use this model, please cite:

	```bibtex
	@software{facilitair_codebert_routing_2025,
	title={Facilitair CodeBERT Routing Model v1},
	author={Facilitair Team},
	year={2025},
	url={https://huggingface.co/somethingobscurefordevstuff/facilitair-codebert-routing-v1}
	}
	```

	---

	## License

	MIT License - Free for commercial use

	---

	## Contact

	- Repository: https://github.com/facilitair/codebert-routing
	- Issues: https://github.com/facilitair/codebert-routing/issues
	- Website: https://beta.facilitair.ai

	---

	## Version History

	### v1.0.0 (2025-11-17)
	- Initial release
	- 99.93% validation accuracy
	- 19 domains, 2 strategies, 8 capabilities, 5 execution types
	- Trained on 150K balanced examples

	---

	Model Card: [Full Model Card](model-card.md)
	Training Details: [Training Report](training-report.md)