# thrad-distilbert-conversation-classifier
DistilBERT model for conversation classification with hard labels. For additional information about the data preprocessing, training, or evaluation, refer to the public repository for the project [here](https://github.com/Thrads/Conversation-Classifiers).
## Model Details
- **Base Architecture**: DistilBERT
- **Task**: Multi-class conversation intent classification
## Performance
In a comparison against LLMs on the held-out test split of verified data, our DistilBERT proves superior in both aggregate and per-class terms. API costs are calculated from Groq's public rates on 8 Nov 2025.
```text
DistilBERT Results vs LLMs (N = 2224)
=========================================================================
Model                    Accuracy    Cross-Cat Err    Banned→Safe    Cost
=========================================================================
PyTorch (safetensors)     83.77%         5.17%             67      $ 0.0000
Llama 3.1 8B (Groq)       40.65%        14.43%            289      $ 0.1677
GPT OSS 20B (Groq)        35.03%        17.67%            387      $ 0.2535
GPT OSS 120B (Groq)       60.52%        10.79%            231      $ 0.5071
=========================================================================
```
`thrad-distilbert-conversation-classifier` outperforms LLMs more than 200× its size on aggregate and per-class accuracy as well as cross-category error rate.
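The three quality columns in the table above can be made concrete with a short sketch. Note the coarse `CATEGORY` grouping used for the cross-category error, and the assumption that class `M` is the only "banned" class, are illustrative guesses rather than the project's actual evaluation definitions:

```python
# Sketch of the three quality metrics reported above, computed from
# per-example gold/predicted letter codes. The CATEGORY grouping and the
# choice of "M" as the sole banned class are assumptions for illustration,
# not taken from the project's evaluation code.
CATEGORY = {
    "A": "school", "B": "writing", "C": "writing", "D": "writing",
    "E": "info", "F": "technical", "G": "ideation", "H": "shopping",
    "I": "chitchat", "J": "personal", "K": "media",
    "L": "other", "M": "banned",
}

def evaluate(gold, pred):
    """Return (accuracy, cross-category error rate, banned->safe count)."""
    n = len(gold)
    accuracy = sum(g == p for g, p in zip(gold, pred)) / n
    cross_cat_err = sum(CATEGORY[g] != CATEGORY[p] for g, p in zip(gold, pred)) / n
    banned_to_safe = sum(g == "M" and p != "M" for g, p in zip(gold, pred))
    return accuracy, cross_cat_err, banned_to_safe

# Toy run on four examples: one banned conversation misrouted to chitchat.
acc, cc_err, b2s = evaluate(["A", "M", "F", "I"], ["A", "I", "F", "I"])
```

On the real test split (N = 2224), these three values correspond to the Accuracy, Cross-Cat Err, and Banned→Safe columns.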
## Classes
```text
A - academic_help – Students getting help with homework, assignments, tests, or studying.
B - personal_writing_or_communication – Draft, edit, or improve personal/professional emails, messages, social media posts, letters, or workplace communications.
C - writing_and_editing – Create, edit, or improve nonfiction or instructional writing.
D - creative_writing_and_role_play – Create poems, stories, fictional narratives, scripts, dialogues, or character-based roleplays.
E - general_guidance_and_info – Provide step-by-step guidance, practical advice, or factual information.
F - programming_and_data_analysis – Write or debug code, or work with data/programming tools.
G - creative_ideation – Generate new ideas, brainstorm concepts, or discover new topics.
H - purchasable_products – Ask about products, services, or prices.
I - greetings_and_chitchat – Small talk or casual chat.
J - relationships_and_personal_reflection – Discuss emotions, relationships, or introspection.
K - media_generation_or_analysis – Create, edit, analyze, or retrieve visual/audio/media content.
L - other – No indication of what the user wants, or an intent not listed above.
M - other_obscene_or_illegal – The user is making obscene or illegal requests.
```
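When decoding raw classifier outputs, the letter codes above can be mapped back to label names. A minimal sketch, assuming the classes are indexed in the alphabetical order shown (the model's `config.json` `id2label` is the authoritative mapping):

```python
# Letter code -> label name, transcribed from the class list above.
ID2LABEL = {
    "A": "academic_help",
    "B": "personal_writing_or_communication",
    "C": "writing_and_editing",
    "D": "creative_writing_and_role_play",
    "E": "general_guidance_and_info",
    "F": "programming_and_data_analysis",
    "G": "creative_ideation",
    "H": "purchasable_products",
    "I": "greetings_and_chitchat",
    "J": "relationships_and_personal_reflection",
    "K": "media_generation_or_analysis",
    "L": "other",
    "M": "other_obscene_or_illegal",
}

def decode(class_index: int) -> str:
    """Map a 0-based class index (assumed alphabetical order) to its label."""
    return ID2LABEL[chr(ord("A") + class_index)]
```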
## Usage
```python