# Sentinel-D DistilBERT Intent Classifier v1
|
|
## Model Details
- **Base Model**: distilbert-base-uncased
- **Task**: Sequence Classification (4 classes)
- **Training Date**: 2026-03-04T03:36:30.927788
- **Classes**: VERSION_PIN, API_MIGRATION, MONKEY_PATCH, FULL_REFACTOR
|
|
## Intent Classes
1. **VERSION_PIN**: Pinning or locking dependency versions
2. **API_MIGRATION**: Migrating between API versions or protocols
3. **MONKEY_PATCH**: Quick/temporary fixes via runtime patching
4. **FULL_REFACTOR**: Complete rewriting or structural redesign
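To make the distinctions concrete, the following sketch pairs each class with an example utterance. These phrases are invented for illustration and are not drawn from the training data:

```python
# Hypothetical example utterances, one per intent class (not real training data)
EXAMPLE_UTTERANCES = {
    "VERSION_PIN": "Lock lodash to 4.17.21 in package.json",
    "API_MIGRATION": "Port our client from the v1 REST endpoints to the v2 API",
    "MONKEY_PATCH": "Override requests.get at runtime to add a retry header for now",
    "FULL_REFACTOR": "Rewrite the billing module around a clean service-layer design",
}

for intent, text in EXAMPLE_UTTERANCES.items():
    print(f"{intent}: {text}")
```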
|
|
## Usage
|
|
```python
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch
import json

tokenizer = DistilBertTokenizer.from_pretrained("./distilbert-intent-classifier-v1")
model = DistilBertForSequenceClassification.from_pretrained("./distilbert-intent-classifier-v1")
model.eval()  # disable dropout for inference

text = "I updated my package.json to lock the Express version to 4.18.0"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class_id = logits.argmax(-1).item()

# Map ID back to label
with open("label_config.json") as f:
    label_config = json.load(f)
predicted_label = label_config["id_to_label"][str(predicted_class_id)]
print(f"Predicted Intent: {predicted_label}")
```
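`argmax` returns only the top class; applying a softmax to the logits additionally gives a per-class confidence. A minimal pure-Python sketch (the logit values below are invented for illustration; in the snippet above you would use `outputs.logits[0].tolist()` instead):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the four classes, in label-id order
logits = [3.2, -0.5, 0.1, -1.0]
probs = softmax(logits)

labels = ["VERSION_PIN", "API_MIGRATION", "MONKEY_PATCH", "FULL_REFACTOR"]
for label, p in sorted(zip(labels, probs), key=lambda pair: -pair[1]):
    print(f"{label}: {p:.3f}")
```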
|
|
## Files
- `pytorch_model.bin`: Fine-tuned model weights
- `config.json`: Model configuration
- `vocab.txt`: Tokenizer vocabulary
- `label_config.json`: Intent class mappings
- `README.md`: This file
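The exact schema of `label_config.json` is not shown here; since the usage snippet indexes an `id_to_label` table by string ID, a plausible structure (an assumption, not the shipped file) looks like this:

```python
import json

# Assumed schema for label_config.json: string ids in id_to_label because
# the usage example looks up label_config["id_to_label"][str(class_id)]
label_config = {
    "id_to_label": {"0": "VERSION_PIN", "1": "API_MIGRATION",
                    "2": "MONKEY_PATCH", "3": "FULL_REFACTOR"},
    "label_to_id": {"VERSION_PIN": 0, "API_MIGRATION": 1,
                    "MONKEY_PATCH": 2, "FULL_REFACTOR": 3},
}

# Round-trip check: every id maps back to itself through both tables
for i, label in label_config["id_to_label"].items():
    assert label_config["label_to_id"][label] == int(i)

print(json.dumps(label_config, indent=2))
```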
|
|
## Training Configuration
- Epochs: 6
- Batch Size: 16
- Learning Rate: tuned dynamically between 1e-5 and 5e-5
- Optimizer: AdamW
- Loss: weighted cross-entropy
- Class Imbalance Handling: random oversampling plus weighted loss
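Weighted cross-entropy gives rarer classes a larger loss weight. One common scheme, shown here as a sketch with invented per-class counts (the actual weighting used in training is not documented here), is inverse-frequency weighting:

```python
def inverse_frequency_weights(counts):
    """Weight class i by total / (num_classes * counts[i]), so rarer classes weigh more."""
    total = sum(counts)
    n = len(counts)
    return [total / (n * c) for c in counts]

# Hypothetical training-set counts for the four classes, in label-id order
counts = [400, 300, 200, 100]
weights = inverse_frequency_weights(counts)
print(weights)
# To use with PyTorch: torch.nn.CrossEntropyLoss(weight=torch.tensor(weights))
```

With this normalization, the count-weighted sum of the weights equals the total sample count, so the overall loss scale stays comparable to the unweighted case.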
|
|
## Performance Targets
- Accuracy: >= 0.80
- Macro F1: >= 0.80
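Macro F1 averages the per-class F1 scores with equal weight, so a weak minority class drags the metric down even when overall accuracy stays high. A pure-Python sketch on invented predictions (in practice `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same thing):

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Invented ground truth and predictions over the four intent ids 0..3
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]
print(f"macro F1 = {macro_f1(y_true, y_pred, labels=[0, 1, 2, 3]):.3f}")  # → 0.733
```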
|
|