---
license: apache-2.0
tags:
- bert
- deberta
- text-classification
- fine-tuned
- databricks-dolly
- prompt-category
language: en
---

# 🧠 DeBERTa-v3 Base - Prompt Category Classifier (Fine-tuned)

This model is a fine-tuned version of [`microsoft/deberta-v3-base`](https://huggingface.co/microsoft/deberta-v3-base) on a modified version of the [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) dataset.
It has been trained to classify the **prompt category** based solely on the **response** text.

## 🏷️ Task

**Text Classification**
**Input**: Response text from a human-annotated prompt
**Output**: One of the eight predefined categories:
- `brainstorming`
- `classification`
- `closed_qa`
- `creative_writing`
- `general_qa`
- `information_extraction`
- `open_qa`
- `summarization`

## 📊 Evaluation

The model was evaluated on a balanced version of the dataset. Here are the results:

- **Validation Accuracy**: ~85.5%
- **F1 Score**: ~85.0%
- Best performance on: `creative_writing`, `classification`, `summarization`
- Room for improvement on: `open_qa`

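The card does not state which F1 averaging was used. For reference, here is a minimal sketch of how these metrics are typically computed with scikit-learn; `y_true` and `y_pred` are placeholder integer category ids standing in for real validation labels and predictions:

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder ids (0-7 index the eight categories above); a real evaluation
# would use labels and predictions from the held-out validation split.
y_true = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1]
y_pred = [0, 1, 2, 3, 4, 5, 6, 7, 1, 1]

print("accuracy:", accuracy_score(y_true, y_pred))
# "macro" weighs all eight categories equally -- an assumption, since the
# averaging method is not stated in this card.
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```
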
## 🧪 How to Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("mariadg/deberta-v3-category-classifier")
tokenizer = AutoTokenizer.from_pretrained("mariadg/deberta-v3-category-classifier")
model.eval()  # inference mode

text = "The mitochondria is known as the powerhouse of the cell."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():  # no gradients needed for inference
    outputs = model(**inputs)
pred = torch.argmax(outputs.logits, dim=1).item()

print(pred)  # integer class index; see the mapping snippet below
```

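To map the predicted index back to a category name, the label mapping stored in the model config can be used. This assumes the checkpoint was saved with `id2label` populated; if it only contains generic `LABEL_0`-style names, build the mapping from the category list above instead:

```python
# Continues the snippet above; `model` and `pred` are already defined.
label = model.config.id2label[pred]
print(label)  # e.g. "general_qa" (exact output depends on the saved config)
```
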
## 🛠️ Training Details

- **Base model**: `microsoft/deberta-v3-base`
- **Framework**: PyTorch
- **Max length**: 256
- **Batch size**: 16
- **Epochs**: 4
- **Loss function**: `CrossEntropyLoss`

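For reference, here is a minimal sketch of a fine-tuning setup consistent with these hyperparameters, built on the Hugging Face `Trainer`. The card says a modified, balanced version of the dataset was used; the raw-dataset loading, train/validation split, and learning rate below are assumptions, not the author's exact recipe:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CATEGORIES = ["brainstorming", "classification", "closed_qa", "creative_writing",
              "general_qa", "information_extraction", "open_qa", "summarization"]

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base",
    num_labels=len(CATEGORIES),
    id2label=dict(enumerate(CATEGORIES)),
    label2id={c: i for i, c in enumerate(CATEGORIES)},
)

def preprocess(batch):
    # Classify from the response text only, per the task definition above.
    enc = tokenizer(batch["response"], truncation=True, max_length=256)
    enc["labels"] = [CATEGORIES.index(c) for c in batch["category"]]
    return enc

data = load_dataset("databricks/databricks-dolly-15k", split="train")
data = data.map(preprocess, batched=True, remove_columns=data.column_names)
data = data.train_test_split(test_size=0.1, seed=42)  # assumed split

args = TrainingArguments(
    output_dir="deberta-v3-category-classifier",
    per_device_train_batch_size=16,  # batch size from the card
    num_train_epochs=4,              # epochs from the card
    learning_rate=2e-5,              # assumption; not stated in the card
)

# Trainer applies CrossEntropyLoss by default for sequence classification,
# matching the loss listed above.
trainer = Trainer(model=model, args=args, tokenizer=tokenizer,
                  train_dataset=data["train"], eval_dataset=data["test"])
trainer.train()
```
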
## 📄 License

Apache 2.0

---

Fine-tuned by [mariadg](https://huggingface.co/mariadg) for research purposes.