MyVillage Project - Intent Router Model

This is a fine-tuned DistilBERT model designed to route user queries within the MyVillage Project (Coding in Color) chatbot ecosystem.

Unlike a standard chatbot that answers everything directly, this model acts as a Traffic Controller. It analyzes the user's metadata and conversation history (last 5 messages) to classify their intent into one of 6 organizational categories. The system then routes the request to the correct database or API endpoint (e.g., directing "Invoice questions" to the Finance System).

🎯 Intent Categories (Labels)

The model predicts one of the following 6 distinct topics:

| Label ID | Label Name | Description | Key Indicators (Examples) |
|---|---|---|---|
| 0 | FINANCIAL | Money, Payments, Invoices | "Where do I upload receipt?", "W9 form", "Reimbursement", "Vendor payment" |
| 1 | CIC_EVENTS | Coding in Color Events | "Student showcase", "Hackathon", "Robot demo", "Registration deadline" |
| 2 | CIC_ACTIVITIES | Internal Dev Work | "Slack check-in", "n8n workflow", "Pushing code", "Daily standup", "API error" |
| 3 | ORG_RESOURCES | General Admin/IT Support | "Lost password", "Employee handbook", "Laptop request", "HR contact" |
| 4 | ORG_EVENTS | Strategic/Community Events | "Board meeting", "Town hall", "Fundraising gala", "Demographic analysis" |
| 5 | STAFF_GRANTS | Funding & Proposals | "NSF proposal", "Grant submission", "Budget review", "Logic model", "Impact metrics" |
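
Once a label is predicted, routing can be as simple as a lookup table. The sketch below is a minimal illustration, not the project's actual routing code: the six label names come from the list above, while the destination names and the `route` helper are hypothetical placeholders.

```python
# Label names are the six categories listed above.
# Destination names are hypothetical placeholders, not real endpoints.
ROUTES = {
    "FINANCIAL": "finance_system",
    "CIC_EVENTS": "cic_events_calendar",
    "CIC_ACTIVITIES": "dev_workflow_tools",
    "ORG_RESOURCES": "it_helpdesk",
    "ORG_EVENTS": "org_events_calendar",
    "STAFF_GRANTS": "grants_database",
}


def route(label: str) -> str:
    """Return the destination for a predicted label; reject unknown labels."""
    try:
        return ROUTES[label]
    except KeyError:
        raise ValueError(f"Unknown intent label: {label!r}") from None


print(route("FINANCIAL"))
```

Raising on an unknown label, rather than silently defaulting, makes a mismatch between the model's label set and the routing table visible early.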

📊 Model Performance

Training Results

The model achieved 100% Accuracy on the validation set by Epoch 2, demonstrating rapid convergence on the synthetic dataset.

| Metric | Score | Note |
|---|---|---|
| Validation Accuracy | 1.0000 | Perfect memorization of validation patterns. |
| Validation Loss | 0.0553 | Extremely high confidence in predictions. |

Real-World Inference Test

When tested on 30+ unseen edge cases (including trick questions and overlapping concepts), the model achieved:

  • Inference Accuracy: 90.91%
  • Known Weakness: The model occasionally confuses logistics requests for ORG_EVENTS (e.g., ordering lunch for a board meeting) with those for CIC_EVENTS (e.g., ordering pizza for students).
  • Strength: Excellent distinction between "Dev Work" (CIC_ACTIVITIES) and "IT Support" (ORG_RESOURCES).
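
An inference test like the one above can be scored with a few lines of Python. This is a sketch under stated assumptions: the `accuracy` and `confusions` helpers are hypothetical, and the (expected, predicted) pairs are made-up illustrations, not the actual edge cases used for the figure above.

```python
from collections import Counter
from typing import List, Tuple

Pair = Tuple[str, str]  # (expected label, predicted label)


def accuracy(pairs: List[Pair]) -> float:
    """Fraction of test cases where the prediction matches the expected label."""
    if not pairs:
        raise ValueError("no test cases")
    return sum(expected == predicted for expected, predicted in pairs) / len(pairs)


def confusions(pairs: List[Pair]) -> Counter:
    """Count each (expected, predicted) mismatch to surface systematic errors."""
    return Counter((e, p) for e, p in pairs if e != p)


# Illustrative pairs only; NOT the real test set.
results = [
    ("ORG_EVENTS", "CIC_EVENTS"),
    ("CIC_ACTIVITIES", "CIC_ACTIVITIES"),
    ("ORG_RESOURCES", "ORG_RESOURCES"),
    ("STAFF_GRANTS", "STAFF_GRANTS"),
]

print(f"accuracy: {accuracy(results):.2%}")
print(confusions(results).most_common())
```

Counting mismatches by label pair is what exposes a pattern such as ORG_EVENTS being mistaken for CIC_EVENTS.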

🚀 How to Use

Crucial: This model expects a specific input format. You must concatenate the user's metadata and query history into a single string.

Input Format: Role: {role} | Name: {name} | ID: {id} | Phone: {phone} | Email: {email} | History: 'msg1', 'msg2', 'msg3', 'msg4', 'msg5'
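
Building this string by hand is error-prone, so a small helper may be useful. The sketch below is one possible way to assemble it: the function name `build_router_input` and its parameters are hypothetical, and it assumes that only the five most recent messages are kept and that each message is wrapped in single quotes without escaping.

```python
from typing import Sequence


def build_router_input(
    role: str,
    name: str,
    user_id: str,
    phone: str,
    email: str,
    history: Sequence[str],
    max_messages: int = 5,
) -> str:
    """Concatenate user metadata and recent messages into the expected format."""
    # Assumption: keep only the most recent `max_messages` messages.
    recent = list(history)[-max_messages:]
    # Assumption: messages are single-quoted and comma-separated, unescaped.
    quoted = ", ".join(f"'{message}'" for message in recent)
    return (
        f"Role: {role} | Name: {name} | ID: {user_id} | "
        f"Phone: {phone} | Email: {email} | History: {quoted}"
    )


example = build_router_input(
    "Director", "Sarah Boss", "0012", "555-0000", "s.boss@mvp.org",
    ["Review the budget section.", "When is the submission deadline?"],
)
print(example)
```

How messages containing apostrophes were handled in the training data is not stated here, so the quoting rule is worth verifying against real inputs.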

Python Example

from transformers import pipeline

# 1. Load Model
router = pipeline("text-classification", model="your-username/myvillage-router-v1")

# 2. Formulate Input (Simulating a Director asking about Grants)
input_text = "Role: Director | Name: Sarah Boss | ID: 0012 | Phone: 555-0000 | Email: s.boss@mvp.org | History: 'Draft the narrative for the NSF proposal.', 'Review the budget section.', 'Did we get the funding?', 'Attach the logic model PDF.', 'When is the submission deadline?'"

# 3. Predict
result = router(input_text)

print(f"Routed To: {result[0]['label']} (Confidence: {result[0]['score']:.4f})")
# Example output: Routed To: STAFF_GRANTS (Confidence: 0.9823)

⚠️ Limitations

  • Context Window: The model relies heavily on the last 5 messages. If the intent is not clear in that window, accuracy may drop.
  • Synthetic Bias: The model was trained on synthetic data. While it handles natural language well, it may struggle with highly specific slang or typos not present in the training set.
  • Role vs. Content: The model is trained to prioritize content over role. For example, a "Director" asking about "Python Code" will be routed to CIC_ACTIVITIES, not STAFF_GRANTS.
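
One way to soften these limitations is to avoid routing when the model is unsure. The sketch below is a suggestion, not part of the released model: the `pick_route` helper, the 0.6 threshold, and the `NEEDS_CLARIFICATION` fallback name are all hypothetical, and the input mirrors the list of `label`/`score` dictionaries returned by a text-classification pipeline.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def pick_route(
    predictions: List[Prediction],
    threshold: float = 0.6,  # hypothetical cutoff; tune on real traffic
    fallback: str = "NEEDS_CLARIFICATION",  # hypothetical fallback label
) -> str:
    """Return the top label, or the fallback when its score is below threshold."""
    if not predictions:
        return fallback
    top = max(predictions, key=lambda p: float(p["score"]))
    return str(top["label"]) if float(top["score"]) >= threshold else fallback


print(pick_route([{"label": "STAFF_GRANTS", "score": 0.98}]))
print(pick_route([{"label": "ORG_EVENTS", "score": 0.41}]))
```

A fallback like this lets the chatbot ask a clarifying question instead of sending an ambiguous request to the wrong system.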

🛠️ Training Data

The model was trained on 250+ synthetic examples generated to mimic the specific operational workflows of the MyVillage Project. The data includes:

  • Redundant History Patterns: Users often repeat intents in different ways.
  • Role Variation: Every intent is paired with every role (e.g., Admins asking Student questions) to prevent role-based overfitting.
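
The role-variation idea can be sketched as a cross product of roles and intents. This is an illustration only, not the actual data generator: the role list, the sample messages, the `make_examples` helper, and the abbreviated text format (metadata fields other than role are omitted) are all assumptions.

```python
from itertools import product
from typing import Dict, List

# Illustrative roles and one sample message per intent (hypothetical subset).
ROLES = ["Director", "Admin", "Student"]
INTENT_EXAMPLES = {
    "FINANCIAL": "Where do I upload receipt?",
    "CIC_EVENTS": "When is the registration deadline?",
}


def make_examples(roles: List[str], intent_examples: Dict[str, str]) -> List[Dict[str, str]]:
    """Pair every role with every intent so role alone cannot predict the label."""
    return [
        {"text": f"Role: {role} | History: '{message}'", "label": label}
        for role, (label, message) in product(roles, intent_examples.items())
    ]


examples = make_examples(ROLES, INTENT_EXAMPLES)
print(len(examples))
print(examples[0])
```

Because each role appears with each label equally often, the role field carries no signal about the label in data built this way.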