---
language: en
license: apache-2.0
tags:
- text-classification
- intent-classification
- conversational-ai
- bert
- distilbert
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: intent-classifier-v2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - type: accuracy
      value: 1.0000
      name: Test Accuracy
    - type: f1
      value: 1.0000
      name: Weighted F1
---

# DAPA Intent Classifier v2

## Model Description

This model classifies user intents for the DAPA AI conversational assistant system. It supports 13 intents: 12 agentic workflows and 1 general Q&A fallback.

**Model Type:** DistilBERT for Sequence Classification
**Training Date:** 2025-10-25
**Accuracy:** 100.00%
**F1 Score:** 1.0000

## Supported Intents

The model classifies queries into 13 intents:

### Agentic Intents (12)

1. **generate-offer** - Generate job offers, NDAs, contracts
2. **schedule-interview** - Schedule candidate interviews
3. **update-employee-profile** - Update employee information
4. **access-employee-record** - Access employee records
5. **approve-expense** - Approve expense reports
6. **check-leave-balance** - Check leave balances
7. **confirm-training-completion** - Confirm training completion
8. **provide-candidate-feedback** - Provide candidate feedback
9. **request-leave** - Request time off
10. **request-training** - Request training enrollment
11. **review-contract** - Review contracts
12. **submit-expense** - Submit expense reports

### Q&A Intent (1)

13.
    **general-query** - Generic queries (email lookups, status checks, policy questions)

## Training Data

- **Total Samples:** 1,240
- **Training Split:** 70% (868 samples)
- **Validation Split:** 15% (186 samples)
- **Test Split:** 15% (186 samples)
- **Data Balance:** 80-200 examples per intent

## Performance

### Overall Metrics

- **Test Accuracy:** 100.00%
- **Weighted Precision:** 1.0000
- **Weighted Recall:** 1.0000
- **Weighted F1:** 1.0000

### Per-Intent Performance

| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| access-employee-record | 1.000 | 1.000 | 1.000 | 12 |
| approve-expense | 1.000 | 1.000 | 1.000 | 12 |
| check-leave-balance | 1.000 | 1.000 | 1.000 | 12 |
| confirm-training-completion | 1.000 | 1.000 | 1.000 | 12 |
| general-query | 1.000 | 1.000 | 1.000 | 30 |
| generate-offer | 1.000 | 1.000 | 1.000 | 18 |
| provide-candidate-feedback | 1.000 | 1.000 | 1.000 | 12 |
| request-leave | 1.000 | 1.000 | 1.000 | 12 |
| request-training | 1.000 | 1.000 | 1.000 | 12 |
| review-contract | 1.000 | 1.000 | 1.000 | 12 |
| schedule-interview | 1.000 | 1.000 | 1.000 | 15 |
| submit-expense | 1.000 | 1.000 | 1.000 | 12 |
| update-employee-profile | 1.000 | 1.000 | 1.000 | 15 |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("SantmanKT/intent-classifier-v2")
model = AutoModelForSequenceClassification.from_pretrained("SantmanKT/intent-classifier-v2")

# Format the query with its context, as expected by the model
query = "send offer to John"
context = "{domain: hr}"
input_text = f"{query} [context: {context}]"

# Predict the intent (no gradients needed for inference)
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=-1)
confidence, predicted_idx = torch.max(probs, dim=-1)

intent = model.config.id2label[predicted_idx.item()]
print(f"Intent: {intent}, Confidence: {confidence.item():.2%}")
```

## Routing Logic

- **High Confidence (≥70%) + Agentic Intent** → Route to Domain Service
- **Low Confidence (<70%) OR general-query** → Route to Q&A Service

## Model Details

- **Base Model:** distilbert-base-uncased
- **Max Sequence Length:** 128 tokens
- **Training Epochs:** 5
- **Batch Size:** 16
- **Learning Rate:** 2e-5
- **Framework:** HuggingFace Transformers

## Limitations

- Optimized for English only
- Requires context formatting: `[context: {...}]`
- Performance may degrade on queries that differ significantly from the training data

## Citation

If you use this model, please cite:

```
@misc{dapa-intent-classifier-v2,
  author = {SantmanKT},
  title = {DAPA Intent Classifier v2},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/SantmanKT/intent-classifier-v2}
}
```

## License

Apache 2.0
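## Routing Example

The routing rule described above can be sketched as a small function. This is a minimal illustration, not part of the published model: the `route` function, the `AGENTIC_INTENTS` set, and the service names `domain-service` / `qa-service` are assumptions for the sketch, and in practice the `intent` and `confidence` inputs would come from the prediction code in the Usage section.

```python
# Intents routed to the Domain Service when confidence is high enough.
# This set mirrors the 12 agentic intents listed in this card.
AGENTIC_INTENTS = {
    "generate-offer", "schedule-interview", "update-employee-profile",
    "access-employee-record", "approve-expense", "check-leave-balance",
    "confirm-training-completion", "provide-candidate-feedback",
    "request-leave", "request-training", "review-contract", "submit-expense",
}

CONFIDENCE_THRESHOLD = 0.70  # the >=70% cutoff from the Routing Logic section


def route(intent: str, confidence: float) -> str:
    """Return the downstream service for a classified query."""
    if intent in AGENTIC_INTENTS and confidence >= CONFIDENCE_THRESHOLD:
        return "domain-service"
    # Low confidence, or the general-query fallback intent
    return "qa-service"


print(route("generate-offer", 0.93))  # -> domain-service
print(route("generate-offer", 0.55))  # -> qa-service (below threshold)
print(route("general-query", 0.99))   # -> qa-service (Q&A fallback)
```

Keeping the threshold and intent set in one place makes it straightforward to tune the cutoff or add intents without touching the prediction code.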