Commit 1f89c66 by ScottBiggs2 · verified · 1 Parent(s): ec2a737

Update README.md

Files changed (1): README.md (+37 -1)
@@ -9,13 +9,49 @@ tags:
 
  # thrad-distilbert-conversation-classifier
 
- DistilBERT model for conversation classification with hard labels
 
  ## Model Details
 
  - **Base Architecture**: DistilBERT
  - **Task**: Multi-class conversation intent classification
 
  ## Usage
 
  ```python
 
 
  # thrad-distilbert-conversation-classifier
 
+ DistilBERT model for conversation classification with hard labels. For details on data preprocessing, training, and evaluation, see the project's [public repository](https://github.com/Thrads/Conversation-Classifiers).
 
  ## Model Details
 
  - **Base Architecture**: DistilBERT
  - **Task**: Multi-class conversation intent classification
 
+ ## Performance
+
+ On the held-out test split of verified data, our DistilBERT model outperforms the LLMs in both aggregate and classwise comparisons. API costs are calculated from Groq's public rates as of 8 Nov 2025.
+ ```bash
+ DistilBERT Results vs LLMs (N = 2224)
+ =========================================================================================
+ Model                      Accuracy    Cross-Cat Err    Banned→Safe    Cost
+ =========================================================================================
+ PyTorch (safetensors)      83.77%      5.17%            67             $ 0.0000
+ Llama 3.1 8B (Groq)        40.65%      14.43%           289            $ 0.1677
+ GPT OSS 20B (Groq)         35.03%      17.67%           387            $ 0.2535
+ GPT OSS 120B (Groq)        60.52%      10.79%           231            $ 0.5071
+ =========================================================================================
+ ```
+ `thrad-distilbert-conversation-classifier` outperforms these LLMs, including ones more than 200X its size, on aggregate and classwise accuracy as well as cross-category error rate.
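As a quick sanity check on the comparison, the aggregate columns of the results table can be compared mechanically. The sketch below is not part of the commit; it only re-enters the four table rows by hand (the tuple layout and variable names are illustrative assumptions) and confirms which model leads on each metric.

```python
# Rows copied by hand from the results table in this diff:
# (model, accuracy %, cross-category error %, banned->safe count, API cost $)
results = [
    ("PyTorch (safetensors)", 83.77, 5.17, 67, 0.0000),
    ("Llama 3.1 8B (Groq)", 40.65, 14.43, 289, 0.1677),
    ("GPT OSS 20B (Groq)", 35.03, 17.67, 387, 0.2535),
    ("GPT OSS 120B (Groq)", 60.52, 10.79, 231, 0.5071),
]

# Higher accuracy is better; lower error and lower banned->safe count are better.
best_accuracy = max(results, key=lambda r: r[1])[0]
lowest_cross_cat = min(results, key=lambda r: r[2])[0]
fewest_banned_to_safe = min(results, key=lambda r: r[3])[0]

# Accuracy margin (percentage points) over the strongest LLM baseline.
best_llm_accuracy = max(r[1] for r in results[1:])
margin = round(results[0][1] - best_llm_accuracy, 2)

print(best_accuracy, lowest_cross_cat, fewest_banned_to_safe, margin)
```

Classwise accuracy is not included in the table, so this check covers only the aggregate columns.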
+
+
+ ## Classes
+
+ ```bash
+ A - academic_help – Students getting help with homework, assignments, tests, or studying.
+ B - personal_writing_or_communication – Draft, edit, or improve personal/professional emails, messages, social media posts, letters, or workplace communications.
+ C - writing_and_editing – Create, edit, or improve nonfiction or instructional writing.
+ D - creative_writing_and_role_play – Create poems, stories, fictional narratives, scripts, dialogues, or character-based roleplays.
+ E - general_guidance_and_info – Provide step-by-step guidance, practical advice, or factual information.
+ F - programming_and_data_analysis – Write or debug code or work with data/programming tools.
+ G - creative_ideation – Generate new ideas, brainstorm concepts, or discover new topics.
+ H - purchasable_products – Ask about products, services, or prices.
+ I - greetings_and_chitchat – Small talk or casual chat.
+ J - relationships_and_personal_reflection – Discuss emotions, relationships, or introspection.
+ K - media_generation_or_analysis – Create, edit, analyze, or retrieve visual/audio/media content.
+ L - other – No indication of what the user wants, or an intent that is not listed above.
+ M - other_obscene_or_illegal – Obscene or illegal requests.
+ ```
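For downstream code, it can help to have the letter codes available as a lookup table. The following is a minimal sketch, not part of the commit: the dictionary is transcribed from the class list in this diff, while `CLASS_NAMES` and `describe` are hypothetical names, and the label encoding the model actually ships with (for example in its config) may differ.

```python
# Letter-to-name mapping transcribed from the class list in this diff.
# NOTE: this is an illustrative assumption; the model's own label encoding
# may differ and should be checked against its configuration.
CLASS_NAMES = {
    "A": "academic_help",
    "B": "personal_writing_or_communication",
    "C": "writing_and_editing",
    "D": "creative_writing_and_role_play",
    "E": "general_guidance_and_info",
    "F": "programming_and_data_analysis",
    "G": "creative_ideation",
    "H": "purchasable_products",
    "I": "greetings_and_chitchat",
    "J": "relationships_and_personal_reflection",
    "K": "media_generation_or_analysis",
    "L": "other",
    "M": "other_obscene_or_illegal",
}


def describe(letter: str) -> str:
    """Return the class name for a letter code, falling back to 'other' (L)."""
    return CLASS_NAMES.get(letter.strip().upper(), CLASS_NAMES["L"])


print(describe("f"))
```

Falling back to `other` for unknown codes mirrors the definition of class L, but a stricter caller might prefer to raise an error instead.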
+
+
  ## Usage
 
  ```python