# thrad-distilbert-conversation-classifier
DistilBERT model for conversation classification with hard labels. For additional information about the data preprocessing, training, or evaluation, refer to the public repository for the project [here](https://github.com/Thrads/Conversation-Classifiers).
## Model Details
- **Base Architecture**: DistilBERT
- **Task**: Multi-class conversation intent classification
## Performance
In a comparison against LLMs on the held-out test split of verified data, our DistilBERT proves superior in both aggregate and per-class terms. API costs are calculated from Groq's public rates on 8 Nov 2025.
```text
DistilBERT Results vs LLMs (N = 2224)
=========================================================================
Model                    Accuracy    Cross-Cat Err    Banned→Safe    Cost
=========================================================================
PyTorch (safetensors)     83.77%         5.17%             67      $ 0.0000
Llama 3.1 8B (Groq)       40.65%        14.43%            289      $ 0.1677
GPT OSS 20B (Groq)        35.03%        17.67%            387      $ 0.2535
GPT OSS 120B (Groq)       60.52%        10.79%            231      $ 0.5071
=========================================================================
```
`thrad-distilbert-conversation-classifier` outperforms LLMs more than 200× its size on aggregate and per-class accuracy as well as cross-category error rate.
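The three quality columns in the table above can be made concrete with a short sketch. Note the coarse `CATEGORY` grouping used for the cross-category error, and the assumption that class `M` is the only "banned" class, are illustrative guesses rather than the project's actual evaluation definitions:

```python
# Sketch of the three quality metrics reported above, computed from
# per-example gold/predicted letter codes. The CATEGORY grouping and the
# choice of "M" as the sole banned class are assumptions for illustration,
# not taken from the project's evaluation code.
CATEGORY = {
    "A": "school", "B": "writing", "C": "writing", "D": "writing",
    "E": "info", "F": "technical", "G": "ideation", "H": "shopping",
    "I": "chitchat", "J": "personal", "K": "media",
    "L": "other", "M": "banned",
}

def evaluate(gold, pred):
    """Return (accuracy, cross-category error rate, banned->safe count)."""
    n = len(gold)
    accuracy = sum(g == p for g, p in zip(gold, pred)) / n
    cross_cat_err = sum(CATEGORY[g] != CATEGORY[p] for g, p in zip(gold, pred)) / n
    banned_to_safe = sum(g == "M" and p != "M" for g, p in zip(gold, pred))
    return accuracy, cross_cat_err, banned_to_safe

# Toy run on four examples: one banned conversation misrouted to chitchat.
acc, cc_err, b2s = evaluate(["A", "M", "F", "I"], ["A", "I", "F", "I"])
```

On the real test split (N = 2224), these three values correspond to the Accuracy, Cross-Cat Err, and Banned→Safe columns.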
## Classes
```text
A - academic_help – Students getting help with homework, assignments, tests, or studying.
B - personal_writing_or_communication – Draft, edit, or improve personal/professional emails, messages, social media posts, letters, or workplace communications.
C - writing_and_editing – Create, edit, or improve nonfiction or instructional writing.
D - creative_writing_and_role_play – Create poems, stories, fictional narratives, scripts, dialogues, or character-based roleplays.
E - general_guidance_and_info – Provide step-by-step guidance, practical advice, or factual information.
F - programming_and_data_analysis – Write or debug code, or work with data/programming tools.
G - creative_ideation – Generate new ideas, brainstorm concepts, or discover new topics.
H - purchasable_products – Ask about products, services, or prices.
I - greetings_and_chitchat – Small talk or casual chat.
J - relationships_and_personal_reflection – Discuss emotions, relationships, or introspection.
K - media_generation_or_analysis – Create, edit, analyze, or retrieve visual/audio/media content.
L - other – No indication of what the user wants, or an intent not listed above.
M - other_obscene_or_illegal – The user is making obscene or illegal requests.
```
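When decoding raw classifier outputs, the letter codes above can be mapped back to label names. A minimal sketch, assuming the classes are indexed in the alphabetical order shown (the model's `config.json` `id2label` is the authoritative mapping):

```python
# Letter code -> label name, transcribed from the class list above.
ID2LABEL = {
    "A": "academic_help",
    "B": "personal_writing_or_communication",
    "C": "writing_and_editing",
    "D": "creative_writing_and_role_play",
    "E": "general_guidance_and_info",
    "F": "programming_and_data_analysis",
    "G": "creative_ideation",
    "H": "purchasable_products",
    "I": "greetings_and_chitchat",
    "J": "relationships_and_personal_reflection",
    "K": "media_generation_or_analysis",
    "L": "other",
    "M": "other_obscene_or_illegal",
}

def decode(class_index: int) -> str:
    """Map a 0-based class index (assumed alphabetical order) to its label."""
    return ID2LABEL[chr(ord("A") + class_index)]
```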
## Usage
```python