	---
	library_name: transformers
	license: apache-2.0
	base_model: distilbert-base-uncased
	tags:
	- generated_from_trainer
	model-index:
	- name: help-classifier-v2
	results: []
	datasets:
	- King-8/help-request-messages-v2
	---
	# 🤖 Help Classifier Model (v2)

	## 🧠 Overview

	The Help Classifier Model (v2) is a fine-tuned NLP model designed to classify student help requests into meaningful categories within a collaborative learning environment.

	This model is part of a larger AI system built for the Coding in Color (CIC) ecosystem, supporting students working across domains such as AI development, game development, 2D/3D art, and robotics.

	Its primary purpose is to:

	* Interpret real student messages
	* Identify intent behind help requests
	* Route inputs to appropriate downstream systems (e.g., generators, agents)

	---

	## 🚀 Version Update (v1 → v2)

	### 🔹 v1

	* Trained on ~100 examples
	* Limited generalization
	* Struggled with messy or informal input

	### 🔹 v2 (Current)

	* Trained on 1,000 examples
	* Balanced dataset across all categories
	* Strong performance on:
	  * informal/slang input
	  * mixed-tone messages
	  * ambiguous phrasing
	  * real CIC-style check-ins

	👉 v2 significantly improves accuracy, stability, and real-world usability

	---

	## 🧩 Task Definition

	Task Type: Text Classification

	Input: Student message
	Output: One of 5 help categories

	---

	## 🏷️ Labels

	| Label              | Description                                         |
	| ------------------ | --------------------------------------------------- |
	| `learning_help`    | User is trying to understand a concept or skill     |
	| `project_help`     | User needs direction or next steps in a project     |
	| `technical_issue`  | Something is broken or not working                  |
	| `attendance_issue` | User missed a meeting or needs to catch up          |
	| `general_guidance` | User expresses uncertainty, stress, or needs advice |

	---

	## 🏗️ Model Architecture

	* Base Model: distilbert-base-uncased
	* Fine-tuned for sequence classification
	* Number of labels: 5
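
A minimal sketch of how a five-label DistilBERT classifier like this one could be instantiated with `transformers`. The label order in `LABELS` is an assumption (this card does not state which id maps to which label), and `build_model` is a hypothetical helper name.

```python
# Label order is an assumption; the model card does not state the id mapping.
LABELS = [
    "learning_help",
    "project_help",
    "technical_issue",
    "attendance_issue",
    "general_guidance",
]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def build_model(base_model: str = "distilbert-base-uncased"):
    """Load the base checkpoint with a fresh 5-way classification head."""
    # Imported lazily so the label mappings work without transformers installed.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        base_model,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
```

Storing `id2label` and `label2id` in the config lets downstream code read label names directly instead of generic `LABEL_0`-style ids.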

	---

	## ⚙️ Training Configuration

	* Epochs: 4
	* Learning Rate: 2e-5
	* Batch Size: 8
	* Weight Decay: 0.01
	* Train/Validation/Test Split: 80/10/10
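
The card does not say how the 80/10/10 split was produced. The sketch below shows one simple way to do it, assuming a seeded random shuffle. `split_80_10_10` is a hypothetical helper, and reusing seed 42 (the training seed) for the split is also an assumption.

```python
import random


def split_80_10_10(examples, seed=42):
    """Shuffle and split examples into train/validation/test lists (80/10/10)."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder, so no example is dropped
    return train, val, test


train, val, test = split_80_10_10(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```

This version does not stratify by label. For a balanced five-class dataset, a stratified split would keep each category equally represented in all three subsets.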

	---

	## 📊 Training Results

	| Epoch | Training Loss | Validation Loss |
	| ----- | ------------- | --------------- |
	| 1     | 0.552         | 0.512           |
	| 2     | 0.111         | 0.122           |
	| 3     | 0.032         | 0.077           |
	| 4     | 0.025         | 0.064           |

	---

	## 📈 Performance Summary

	* Low validation loss (~0.06)
	* Strong generalization across unseen inputs
	* Stable convergence during training
	* Handles:
	  * messy/slang text
	  * indirect requests
	  * multi-layered inputs

	---

	## 🧪 Example Predictions

	Input:

	```
	i missed the meeting and now idk what we’re doing
	```

	Output:

	```
	attendance_issue
	```

	---

	Input:

	```
	my model works but the predictions are weird and I don’t know why
	```

	Output:

	```
	technical_issue
	```

	---

	Input:

	```
	I feel like I’m behind and don’t know what to focus on
	```

	Output:

	```
	general_guidance
	```
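
Predictions like the ones above could be reproduced with the `transformers` text-classification pipeline, roughly as sketched here. The model id `King-8/help-classifier-v2` is an assumption inferred from the dataset namespace and model name, and `classify` and `load_classifier` are hypothetical helper names.

```python
def classify(message: str, classifier) -> str:
    """Return the top predicted label for one message.

    `classifier` is any callable with the text-classification pipeline's
    output shape: a list of {"label": str, "score": float} dicts.
    """
    predictions = classifier(message)
    return predictions[0]["label"]


def load_classifier(model_id: str = "King-8/help-classifier-v2"):
    """Build a Hugging Face pipeline; the default model id is an assumption."""
    from transformers import pipeline  # lazy import: heavy dependency

    return pipeline("text-classification", model=model_id)
```

Typical usage would be `classify("i missed the meeting and now idk what we're doing", load_classifier())`. If the uploaded config lacks an `id2label` mapping, the pipeline returns generic names such as `LABEL_3` rather than the category names.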

	---

	## 🔗 System Integration

	This model is integrated into an MCP (Model Context Protocol) system where it acts as:

	> Entry-point classifier for routing student inputs

	Pipeline example:

	```
	User Input → Help Classifier → (Future: Generator / Summarizer)
	```
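
The routing step in this pipeline can be pictured as a lookup from predicted label to downstream handler. This is only an illustrative sketch: the handler functions and the `ROUTES` table are hypothetical and are not part of the CIC system described here.

```python
from typing import Callable, Dict


def handle_technical_issue(message: str) -> str:
    # Hypothetical downstream handler for broken-code reports.
    return f"[debug-helper] {message}"


def handle_default(message: str) -> str:
    # Hypothetical fallback for labels without a dedicated handler.
    return f"[mentor-queue] {message}"


# Hypothetical routing table: label -> downstream handler.
ROUTES: Dict[str, Callable[[str], str]] = {
    "technical_issue": handle_technical_issue,
}


def route(message: str, label: str) -> str:
    """Dispatch a classified message; unknown labels fall back to the default."""
    handler = ROUTES.get(label, handle_default)
    return handler(message)


print(route("my build keeps crashing", "technical_issue"))
# [debug-helper] my build keeps crashing
```

Keeping the routing table separate from the classifier means a new generator or summarizer can be added without retraining the model.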

	---

	## 🎯 Use Cases

	* Help request classification
	* Slack/Discord message routing
	* Educational AI assistants
	* CIC ecosystem tools
	* AI agent pipelines

	---

	## ⚠️ Limitations

	* Single-label classification (some messages may contain multiple intents)
	* Edge cases may still overlap between categories
	* Domain-specific (focused on student tech environments)

	---

	## 🔮 Future Improvements

	* Multi-label classification
	* Larger dataset (2,000+ examples)
	* Confidence scoring
	* Integration with response generation models
	* Continuous retraining with real user data
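
As an illustration of the confidence scoring item, one common approach is to apply a softmax to the classifier's logits and withhold the label when the top probability is low. The sketch below assumes that approach. The 0.6 threshold is arbitrary and `predict_with_confidence` is a hypothetical helper, not something the current model ships with.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits, labels, threshold=0.6):
    """Return (label, confidence); label is None when confidence < threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    label = labels[best] if confidence >= threshold else None
    return label, confidence
```

A `None` label could then be sent to a human mentor or a clarifying-question step, which would also soften the single-label limitation for messages that mix several intents.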

	---

	## 👤 Author

	Created by Kingston Lewis as part of the Coding in Color program for the AI Dev team.

	---

	# help-classifier-v2

	This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the King-8/help-request-messages-v2 dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.0643


	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- num_epochs: 4
	- mixed_precision_training: Native AMP
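
For readers who want to reproduce a similar run, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the author's training script: `output_dir` is a hypothetical path, and the mixed-precision flag is left out because the card does not say whether `fp16` or `bf16` was used.

```python
# Keyword names follow transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "weight_decay": 0.01,  # from the Training Configuration section
}


def build_training_args(output_dir: str = "help-classifier-v2"):
    """Create TrainingArguments; output_dir is a hypothetical path."""
    from transformers import TrainingArguments  # lazy import: heavy dependency

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```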


	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 0.5524        | 1.0   | 88   | 0.5124          |
	| 0.1114        | 2.0   | 176  | 0.1221          |
	| 0.0324        | 3.0   | 264  | 0.0771          |
	| 0.0249        | 4.0   | 352  | 0.0643          |
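
The Step column can be related to the batch size with a little arithmetic. The sketch below assumes a single device, no gradient accumulation, and that the final partial batch is kept. The card states none of these, so the derived sizes are estimates.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch when the final partial batch is kept."""
    return math.ceil(num_examples / batch_size)


def train_size_range(steps: int, batch_size: int) -> tuple:
    """Smallest and largest training-set sizes that yield `steps` steps per epoch."""
    return (steps - 1) * batch_size + 1, steps * batch_size


print(train_size_range(88, 8))      # (697, 704)
print(steps_per_epoch(704, 8) * 4)  # 352
```

Under those assumptions, 88 steps per epoch at batch size 8 corresponds to between 697 and 704 training examples, and four epochs give the 352 total steps shown in the last row.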


	### Framework versions

	- Transformers 5.0.0
	- Pytorch 2.10.0+cpu
	- Datasets 4.0.0
	- Tokenizers 0.22.2