# KDLLM Teacher Model: Fine-tuned BERT (IMDb Sentiment Classification)
This repository hosts the Teacher Model used in the KDLLM framework from:
KDLLM: Knowledge Distillation for Compressed and Copyright-Safe Large Language Model Sharing
Shiva Shrestha et al.
Tsinghua Science and Technology, 2025
Manuscript ID: TST-2025-0253
## Overview
The teacher model is based on bert-base-uncased, fine-tuned on the IMDb sentiment classification dataset for binary classification (positive/negative). This model serves as the high-capacity reference for training compact student models via knowledge distillation.
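To illustrate how a teacher's outputs can supervise a student, here is a minimal sketch of the standard soft-target distillation objective (a KL term on temperature-scaled logits plus a hard-label cross-entropy term). The temperature `T` and mixing weight `alpha` below are illustrative defaults, not hyperparameters reported in the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target KD loss: KL(student || teacher) at temperature T,
    blended with standard cross-entropy on the ground-truth labels.
    T and alpha are illustrative values, not the paper's settings."""
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients, as in Hinton et al.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Toy batch: 4 examples, 2 classes (negative / positive)
student = torch.randn(4, 2)
teacher = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])
loss = distillation_loss(student, teacher, labels)
print(loss.item())
```

In practice the teacher logits would come from this BERT model run in eval mode over the IMDb training set, and the student would be a smaller network trained to minimize this loss.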
## Model Architecture
- Base Model: bert-base-uncased
- Fine-tuning Task: Sentiment Classification (IMDb dataset)
- Layers: 12
- Hidden Size: 768
- Attention Heads: 12
- Total Parameters: ~110M
- File Size: ~418MB
## Performance
- Dataset: IMDb (50,000 movie reviews)
- Accuracy: 92.40%
- F1 Score: 92.44%
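For reference, the F1 score reported above is the standard harmonic mean of precision and recall over the positive class. The sketch below shows that computation in plain Python on a toy label set (not the IMDb test set):

```python
def binary_f1(y_true, y_pred):
    # Counts over the positive class (label 1)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy example: 2 true positives, 1 false negative
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0]
print(binary_f1(y_true, y_pred))  # 0.8
```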
## Inference Example
```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned teacher model and its tokenizer from the Hub
tokenizer = BertTokenizer.from_pretrained("sh7vashrestha/BertBaseUncased-SenetimentAnalysis")
model = BertForSequenceClassification.from_pretrained("sh7vashrestha/BertBaseUncased-SenetimentAnalysis")
model.eval()

# Tokenize the input and run a forward pass without tracking gradients
inputs = tokenizer("The movie was absolutely wonderful!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Map the predicted class index to a human-readable label
prediction = outputs.logits.argmax(dim=1)
label_mapping = {0: "negative", 1: "positive"}
print(label_mapping[prediction.item()])
```