license: apache-2.0
language:
  - en
  - zh
  - multilingual
library_name: peft
base_model: jhu-clsp/mmBERT-base
tags:
  - text-classification
  - intent-classification
  - mmbert
  - lora
  - multilingual
  - vllm-semantic-router
datasets:
  - TIGER-Lab/MMLU-Pro
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification

mmBERT Intent Classifier (LoRA Adapter)

A multilingual intent classification model based on mmBERT (Multilingual ModernBERT) with a LoRA adapter for parameter-efficient fine-tuning.

Model Description

This model classifies text into 14 MMLU-Pro academic categories using a LoRA-enhanced mmBERT backbone. It supports 1800+ languages through mmBERT's multilingual pretraining.

Categories

  • biology, business, chemistry, computer science, economics
  • engineering, health, history, law, math
  • other, philosophy, physics, psychology
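To map the classifier's output indices back to these names, the labels are typically indexed in the alphabetical order shown above. This is an assumption worth verifying against the adapter's `config.json` before relying on it:

```python
# Hypothetical label mapping, assuming indices follow alphabetical order.
CATEGORIES = [
    "biology", "business", "chemistry", "computer science", "economics",
    "engineering", "health", "history", "law", "math",
    "other", "philosophy", "physics", "psychology",
]
id2label = {i: name for i, name in enumerate(CATEGORIES)}
label2id = {name: i for name, i in zip(CATEGORIES, range(len(CATEGORIES)))}
```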

Performance

Metric           Score
Accuracy         77.9%
F1 (weighted)    78.0%
Training time    139 seconds (AMD MI300X GPU)

Training Details

  • Base Model: jhu-clsp/mmBERT-base
  • LoRA Rank: 32
  • LoRA Alpha: 64
  • Trainable Parameters: 6.8M / 314M (2.2%)
  • Epochs: 10
  • Batch Size: 64
  • Learning Rate: 2e-5
  • Dataset: TIGER-Lab/MMLU-Pro (9,144 samples)

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the base model with a 14-way classification head, then attach the LoRA adapter
base_model = AutoModelForSequenceClassification.from_pretrained(
    "jhu-clsp/mmBERT-base",
    num_labels=14,
)
model = PeftModel.from_pretrained(base_model, "llm-semantic-router/mmbert-intent-classifier-lora")
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/mmBERT-base")
model.eval()

# Classify a query
text = "What are the legal requirements for forming a corporation?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(-1).item()

Multilingual Support

This model supports cross-lingual transfer:

  • Fine-tuned on English MMLU-Pro data
  • Can classify queries in 1800+ languages
  • Strongest performance on English, with good zero-shot transfer to Chinese, Spanish, French, German, and other high-resource languages
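Because transfer quality varies by language, it can help to inspect prediction confidence rather than the argmax alone. A minimal, framework-free sketch of turning classifier logits into probabilities (the logit values below are hypothetical, for a 3-class toy example):

```python
import math

def softmax(logits):
    """Convert raw classifier logits to a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits; in practice use outputs.logits from the model
probs = softmax([2.0, 1.0, 0.1])
best = max(range(len(probs)), key=probs.__getitem__)
```

A low top-class probability can be used as a signal to fall back to a default route for languages where transfer is weaker.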

Part of vLLM Semantic Router

This model is part of the vLLM Semantic Router project - a Mixture-of-Models (MoM) router that understands request intent.

Citation

@misc{mmbert-intent-classifier,
  author = {vLLM Semantic Router Team},
  title = {mmBERT Intent Classifier with LoRA},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/llm-semantic-router/mmbert-intent-classifier-lora}
}

License

Apache 2.0