Based on the paper: LoRA: Low-Rank Adaptation of Large Language Models (arXiv:2106.09685)
This is a parameter-efficient fine-tune of distilbert-base-uncased, trained with the LoRA technique via the PEFT library for intent recognition on a custom dataset.

You can use this model to classify user intents in applications such as chatbots, virtual assistants, and voice-based interfaces.

The model supports classification over 33 intent labels.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
from peft import PeftModel
import joblib

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("hopjetair/intent")
base_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=33
)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, "hopjetair/intent")

# Load the fitted label encoder (maps class indices back to intent names)
label_encoder = joblib.load("label_encoder.pkl")

# Inference
text = "Book me a flight to New York"
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
result = clf(text)[0]

# The pipeline returns generic labels like "LABEL_12"; extract the numeric id
label_num = int(result["label"].split("_")[-1])

# Convert the id back to a human-readable intent label
predicted_label = label_encoder.inverse_transform([label_num])[0]
print(predicted_label)
```
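The quickstart above decodes the pipeline's generic `LABEL_n` strings with the saved `LabelEncoder`. A minimal, self-contained sketch of that decoding step, using a small hand-built mapping in place of `label_encoder.pkl` (the intent names below are hypothetical, not the model's real labels):

```python
# Hypothetical id-to-intent mapping; in practice this comes from the
# fitted LabelEncoder saved as label_encoder.pkl.
id_to_intent = {0: "book_flight", 1: "cancel_flight", 2: "check_status"}

def decode(pipeline_result):
    """Map a pipeline result like {'label': 'LABEL_1', ...} to an intent name."""
    # Extract the numeric suffix from the generic "LABEL_n" string
    label_num = int(pipeline_result["label"].split("_")[-1])
    return id_to_intent[label_num]

print(decode({"label": "LABEL_1", "score": 0.97}))  # -> cancel_flight
```

The same parsing works for any `text-classification` pipeline whose model config does not define human-readable `id2label` names.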
Approximate timings on average hardware; actual performance may vary based on your system configuration.
| Task | CPU Inference Time | GPU Inference Time |
|---|---|---|
| Single sentence inference | ~100–200 ms | ~5–10 ms |
| Batch of 32 inputs | ~2–3 seconds total | ~100–300 ms total |
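The batch row above assumes inputs are sent to the pipeline in groups rather than one call per sentence. A minimal, self-contained sketch of chunking a list of utterances into batches of 32 (the `batched` helper is illustrative, not part of this model card; each batch would be passed to `clf(...)` from the quickstart):

```python
# Hypothetical helper: split inputs into fixed-size batches before
# handing them to the text-classification pipeline.
def batched(items, batch_size=32):
    """Yield successive batches of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

texts = [f"utterance {i}" for i in range(70)]
sizes = [len(b) for b in batched(texts)]
print(sizes)  # -> [32, 32, 6]
```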
You can run DistilBERT inference on CPU-only instances (e.g., AWS t3.medium, GCP e2-standard).

Base model: distilbert/distilbert-base-uncased