IntentRouter/informant-67M-automotive

Model Description

This is an intent classification model fine-tuned for routing automotive-domain queries to MCP (Model Context Protocol) servers. The model routes user queries to the appropriate service in roughly 10-20 ms per prediction.

Model Details

  • Base Model: distilbert-base-uncased
  • Parameters: 66.4M
  • Domain: Automotive
  • Intent Classes: 4
  • Validation Accuracy: 93.3%
  • Framework: PyTorch + Transformers
  • License: MIT

Intended Use

This model is designed to route user queries in automotive applications to appropriate MCP servers based on intent classification. It replaces generic LLM routing with fast, specialized classification.

Primary Use Cases:

  • Intent routing for automotive chatbots
  • Query classification for customer service
  • MCP server selection for specialized tasks
  • Real-time user request routing

Supported Intents

The model can classify the following intents:

  • diagnostic_engine
  • knowledge_general
  • maintenance_schedule
  • parts_search

MCP Server Routing

The model routes intents to the following MCP servers:
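
The concrete server mapping is not listed in this card, but the routing table can be sketched as a dictionary keyed by the four intent labels. A minimal sketch (the server names below are hypothetical placeholders, not part of this release):

```python
# Hypothetical intent -> MCP server routing table.
# Server names are placeholders; substitute your deployment's actual servers.
INTENT_TO_MCP_SERVER = {
    "diagnostic_engine": "mcp-diagnostics",
    "knowledge_general": "mcp-knowledge-base",
    "maintenance_schedule": "mcp-maintenance",
    "parts_search": "mcp-parts-catalog",
}

def route(intent: str) -> str:
    """Return the MCP server for a classified intent, falling back to the
    general knowledge server for unrecognized labels."""
    return INTENT_TO_MCP_SERVER.get(intent, "mcp-knowledge-base")
```

A default fallback keeps the router total: any label outside the four trained intents still gets a destination.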

Usage

Direct Inference

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import json

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("IntentRouter/informant-67M-automotive")
model = AutoModelForSequenceClassification.from_pretrained("IntentRouter/informant-67M-automotive")
model.eval()

# If your checkpoint uses the custom IntentClassifier head, load that instead
# See: https://github.com/your-org/intent-router

# Load label mappings from the model configuration
with open("model_config.json", "r") as f:
    config = json.load(f)

label_to_id = config["labels"]["label_to_id"]
id_to_label = config["labels"]["id_to_label"]

# Predict intent
def predict_intent(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.softmax(outputs.logits, dim=-1)
        predicted_id = torch.argmax(predictions, dim=-1).item()
        confidence = predictions[0][predicted_id].item()
    
    intent = id_to_label[str(predicted_id)]
    return intent, confidence

# Example usage
intent, confidence = predict_intent("My engine is making strange noises")
print(f"Intent: {intent} (confidence: {confidence:.3f})")

Using the Intent Router System

# Install the intent router system
git clone https://github.com/your-org/intent-router
cd intent-router

# Use the model
python intent_inference.py \
    --model-id IntentRouter/informant-67M-automotive \
    --text "Your query here" \
    --confidence

# Start API server
python intent_inference.py \
    --model-id IntentRouter/informant-67M-automotive \
    --serve --port 8000

API Usage

# Single prediction
curl -X POST "http://localhost:8000/predict" \
     -H "Content-Type: application/json" \
     -d '{"text": "Your query here", "return_confidence": true}'

# Batch prediction
curl -X POST "http://localhost:8000/predict/batch" \
     -H "Content-Type: application/json" \
     -d '{"texts": ["Query 1", "Query 2"], "return_confidence": true}'
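
Client code should not blindly trust the route. Assuming the /predict endpoint returns JSON shaped like {"intent": ..., "confidence": ...} (the exact response schema is an assumption, not documented here), a small thresholding helper can reject low-confidence predictions:

```python
# Sketch of client-side confidence thresholding on an API response.
# The response schema {"intent": ..., "confidence": ...} is assumed;
# adjust field names to the actual server output.
FALLBACK_INTENT = "knowledge_general"

def pick_intent(response: dict, threshold: float = 0.7) -> str:
    """Accept the predicted intent only if confidence clears the threshold."""
    if response.get("confidence", 0.0) >= threshold:
        return response["intent"]
    return FALLBACK_INTENT

print(pick_intent({"intent": "parts_search", "confidence": 0.91}))  # parts_search
print(pick_intent({"intent": "parts_search", "confidence": 0.42}))  # knowledge_general
```

The 0.7 threshold is illustrative; tune it against your own validation traffic.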

Training Details

Training Data

  • Domain: Automotive
  • Total Examples: 223
  • Generation Method: Config-driven with linguistic variations
  • Data Source: automotive
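
"Config-driven with linguistic variations" suggests template expansion: each intent gets a handful of templates, and slot fillers multiply them into labeled examples. A toy sketch (the templates and slot values below are illustrative, not the actual generation config):

```python
from itertools import product

# Toy config-driven data generation: templates x slot fillers -> labeled examples.
# Templates and slots here are illustrative, not the real training config.
CONFIG = {
    "parts_search": {
        "templates": ["Where can I buy a {part}?", "I need a new {part}"],
        "slots": {"part": ["brake pad", "air filter", "spark plug"]},
    },
}

def generate(config):
    examples = []
    for intent, spec in config.items():
        for template in spec["templates"]:
            for values in product(*spec["slots"].values()):
                filled = template.format(**dict(zip(spec["slots"], values)))
                examples.append((filled, intent))
    return examples

examples = generate(CONFIG)  # 2 templates x 3 parts = 6 labeled examples
```

A small config like this expands combinatorially, which is how a few hundred examples can cover four intents.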

Training Procedure

  • Base Model: distilbert-base-uncased
  • Training Framework: Transformers Trainer
  • Hardware: CPU
  • Training Run: 2025-07-11 11:38:29

Training Hyperparameters

  • Learning Rate: 2e-05
  • Batch Size: 16
  • Epochs: 5
  • Max Length: 128
  • Dropout: 0.1
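
These hyperparameters imply a very short run: with at most 223 examples per epoch at batch size 16, five epochs is only a few dozen optimizer steps, consistent with the sub-hour CPU training time. A rough upper bound (assuming the full 223 examples per epoch, before any validation split):

```python
import math

examples = 223    # total examples (upper bound; some are held out for validation)
batch_size = 16
epochs = 5

steps_per_epoch = math.ceil(examples / batch_size)  # 14
total_steps = steps_per_epoch * epochs              # 70
```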

Evaluation

Metrics

  • Validation Accuracy: 93.3%
  • Model Size: 66.4M parameters
  • Inference Speed: ~10-20ms per prediction

Performance Benchmarks

Metric                Value
Validation Accuracy   93.3%
Parameters            66.4M
Inference Time        ~10-20ms
Memory Usage          ~200MB

Model Architecture

This model uses a standard transformer encoder with a classification head:

  1. Base Transformer: distilbert-base-uncased
  2. Classification Head: Linear layer (768 → 4 intent classes)
  3. Dropout: Regularization layer
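
DistilBERT's hidden size is 768, so the classification head is tiny relative to the ~66.4M-parameter encoder. A quick parameter count (pure arithmetic, no framework needed):

```python
hidden_size = 768   # distilbert-base-uncased hidden dimension
num_classes = 4     # the four automotive intents

# Linear head: weight matrix plus bias vector
head_params = hidden_size * num_classes + num_classes  # 3076

# Dropout adds no parameters; virtually all of the model's parameters
# live in the DistilBERT encoder and embedding layers.
```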

Limitations and Bias

Limitations

  • Trained specifically for automotive domain - may not generalize to other domains
  • Performance depends on similarity to training data
  • Requires retraining for new intents
  • Limited to English language

Bias Considerations

  • Training data may reflect biases in automotive terminology
  • Model performance may vary across different user populations
  • Regular evaluation recommended for production use

Environmental Impact

  • Training Hardware: CPU
  • Training Time: Estimated 0.1 hours
  • Carbon Footprint: Minimal (efficient training on smaller model)

Citation

@misc{IntentRouter_informant_67M_automotive,
  title={IntentRouter/informant-67M-automotive: Intent Classification for Automotive MCP Routing},
  author={Intent Router System},
  year={2025},
  howpublished={\url{https://huggingface.co/IntentRouter/informant-67M-automotive}},
  note={Fine-tuned distilbert-base-uncased for intent classification}
}

License

This model is released under the MIT License. See LICENSE for more details.

Model Card Contact

For questions about this model, please open an issue in the Intent Router repository.
