BERT-Base-Uncased Fine-Tuned Model for Intent Classification on CLINC150 Dataset

This repository hosts a fine-tuned BERT model for multi-class intent classification using the CLINC150 (plus) dataset. The model is trained to classify user queries into 150 in-scope intents and handle out-of-scope (OOS) queries.

Model Details

Model Architecture: BERT Base Uncased
Task: Multi-class Intent Classification
Dataset: CLINC150 (plus variant)
Quantization: Float16
Fine-tuning Framework: Hugging Face Transformers

Installation

pip install torch transformers datasets scikit-learn evaluate

Loading the Model

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

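# Load tokenizer and model from the fine-tuned checkpoint directory.
# "bert-base-uncased" would load the base model without the trained intent head;
# the path below follows the Repository Structure section, adjust it to your setup.
model_path = "quantized-model"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
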
# Define test sentences
test_sentences = [
    "Can you tell me the weather in New York?",
    "I want to transfer money to my friend",
    "Play some relaxing jazz music",
]


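# Tokenize and predict
def predict_intent(sentences, model, tokenizer, id2label_fn, device="cpu"):
    """Return the predicted intent label for each input sentence."""
    # Accept a single string as well as a list of strings
    if isinstance(sentences, str):
        sentences = [sentences]

    model.eval()
    model.to(device)

    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(device)

    with torch.no_grad():
        logits = model(**inputs).logits
        predictions = torch.argmax(logits, dim=-1)

    return [id2label_fn(label.item()) for label in predictions]


# Map class indices to intent names using the label mapping stored in the model config
predicted = predict_intent(test_sentences, model, tokenizer, lambda i: model.config.id2label[i])
for sentence, intent in zip(test_sentences, predicted):
    print(f"{sentence} -> {intent}")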
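
When a query may be out of scope, it can help to look at the probability behind a prediction and not only the top label. The following is a minimal sketch, not part of the original code: the helper name top_k_intents and the toy label names are illustrative assumptions, and the logits are made-up numbers standing in for one row of the model's output.

```python
import torch


def top_k_intents(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one query.

    `logits` is a 1-D tensor of raw class scores; `id2label` maps a class
    index to an intent name (for example, model.config.id2label).
    """
    # Cast to FP32 so the softmax is computed in full precision
    probs = torch.softmax(logits.float(), dim=-1)
    k = min(k, probs.numel())
    top_probs, top_ids = torch.topk(probs, k)
    return [(id2label[i.item()], p.item()) for p, i in zip(top_probs, top_ids)]


# Toy example with three made-up classes (hypothetical label names)
example_logits = torch.tensor([2.0, 0.5, -1.0])
example_labels = {0: "weather", 1: "transfer", 2: "play_music"}
print(top_k_intents(example_logits, example_labels, k=2))
```

With the real model, one row of the logits tensor and model.config.id2label could be passed in directly.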

Performance Metrics

Accuracy: 0.947097
Precision: 0.949821
Recall: 0.947097
F1 Score: 0.945876
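
The averaging method behind these numbers is not stated. Recall equal to accuracy while precision differs is consistent with support-weighted averaging, so the sketch below assumes average="weighted"; treat that as an assumption, not as the author's confirmed setup. The helper name summarize_metrics is illustrative.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def summarize_metrics(y_true, y_pred, average="weighted"):
    """Compute accuracy, precision, recall, and F1 for integer class labels."""
    accuracy = accuracy_score(y_true, y_pred)
    # zero_division=0 avoids warnings for classes that are never predicted
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=average, zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for three classes (made-up data, not CLINC150 results)
metrics = summarize_metrics([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
print(metrics)
```

With weighted averaging, recall always equals accuracy, because each class's recall is weighted by its share of the examples.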

Fine-Tuning Details

Dataset

The CLINC150 (plus) dataset contains 151 intent classes (150 in-scope + 1 out-of-scope) for intent classification of English utterances. It includes about 15k training, 3k validation, and 4.5k test examples for the in-scope intents, plus additional out-of-scope queries in each split.

Training

Epochs: 5
Batch size: 16
Learning rate: 2e-5
Evaluation strategy: epoch
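
The hyperparameters above could map onto Hugging Face TrainingArguments roughly as sketched below. This is an assumption-laden sketch, not the original training script: the output directory is hypothetical, the batch size is assumed to be per device, and the helper names are illustrative. The evaluation keyword was renamed from evaluation_strategy to eval_strategy in newer transformers releases, so the sketch picks whichever name the installed version accepts.

```python
def build_training_kwargs(supported_params, output_dir="./results"):
    """Translate the listed hyperparameters into TrainingArguments keywords."""
    # Use the newer keyword name when available, else fall back to the older one
    if "eval_strategy" in supported_params:
        eval_key = "eval_strategy"
    else:
        eval_key = "evaluation_strategy"
    return {
        "output_dir": output_dir,           # hypothetical output location
        "num_train_epochs": 5,
        "per_device_train_batch_size": 16,  # assumes "Batch size" is per device
        "learning_rate": 2e-5,
        eval_key: "epoch",
    }


def make_training_args():
    """Build a TrainingArguments object (transformers is imported lazily)."""
    import inspect
    from transformers import TrainingArguments

    supported = inspect.signature(TrainingArguments.__init__).parameters
    return TrainingArguments(**build_training_kwargs(supported))
```

Calling make_training_args() returns an object that can be passed to a Trainer together with the model and the tokenized dataset splits.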

Quantization

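Post-training quantization here means converting the model weights from FP32 to half precision (FP16) with PyTorch's half(). This halves the size of the stored weights and can reduce inference time on hardware with FP16 support.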
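
The conversion step can be sketched as follows. A small nn.Linear layer stands in for the fine-tuned BERT model so the example runs without downloading anything; the helper names parameter_bytes and convert_to_fp16 are illustrative, not from the original training code.

```python
import torch
from torch import nn


def parameter_bytes(model: nn.Module) -> int:
    """Total memory taken by the model's parameters, in bytes."""
    return sum(p.numel() * p.element_size() for p in model.parameters())


def convert_to_fp16(model: nn.Module) -> nn.Module:
    """Cast floating-point parameters and buffers to FP16 in place."""
    return model.half()


# Stand-in module; in practice this would be the fine-tuned classifier
toy = nn.Linear(4, 3)
size_fp32 = parameter_bytes(toy)
toy = convert_to_fp16(toy)
size_fp16 = parameter_bytes(toy)
print(f"FP32: {size_fp32} bytes, FP16: {size_fp16} bytes")
```

For a Transformers model, calling save_pretrained on the converted model would write the FP16 weights to a directory such as the quantized-model folder shown in the repository structure.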

Repository Structure

.
├── quantized-model/               # Contains the quantized model files
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── vocab.txt
│   └── special_tokens_map.json
└── README.md                      # Model documentation

Limitations

The model is trained specifically for multi-class intent classification on the CLINC150 dataset.
FP16 quantization may result in slight numerical instability in edge cases.
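
To give a feel for the FP16 limitation, the sketch below rounds a few Python floats to half precision using only the standard library (the struct format code "e" is IEEE 754 binary16). It illustrates the number format in general, not a measured behavior of this model; the helper name fp16_roundtrip is illustrative.

```python
import struct


def fp16_roundtrip(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# A decimal fraction, an integer above 2048, and a very small value
for value in (0.1, 2049.0, 1e-8):
    print(value, "->", fp16_roundtrip(value))
```

FP16 keeps about three significant decimal digits, cannot represent every integer above 2048, flushes very small magnitudes to zero, and has a largest finite value of 65504.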

Contributing

Feel free to open issues or submit pull requests to improve the model or documentation.

Safetensors

Model size: 0.1B params
Tensor type: F16
