---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
model-index:
- name: help_classifier
  results: []
datasets:
- King-8/help-request-messages
---

# help_classifier

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [King-8/help-request-messages](https://huggingface.co/datasets/King-8/help-request-messages) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3083

---

## CIC Help Classifier Model

### Overview

This model is a fine-tuned text-classification model that identifies the type of help a user needs within the Coding in Color (CIC) ecosystem.

It enables AI systems to understand user challenges and provide structured support.

---

### Model Details

* Base model: `distilbert-base-uncased`
* Task: Text classification
* Training data: CIC Help Classification Dataset
* Framework: Hugging Face Transformers

---

### Labels

* learning_help
* project_help
* attendance_issue
* technical_issue
* general_guidance
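
The five labels correspond to integer class ids in the model's `config.json`. A minimal sketch of the id/label mapping and of reading a prediction off raw logits — the label order shown here is an assumption; the model config's `id2label` is authoritative:

```python
# Hypothetical id/label order; the authoritative mapping lives in the
# model's config.json (id2label / label2id).
labels = [
    "learning_help",
    "project_help",
    "attendance_issue",
    "technical_issue",
    "general_guidance",
]
id2label = {i: name for i, name in enumerate(labels)}
label2id = {name: i for i, name in enumerate(labels)}

# Given raw logits from the classifier head, the predicted label is the argmax.
logits = [0.3, 2.1, -0.5, 0.7, 0.1]  # made-up example values
predicted = id2label[max(range(len(logits)), key=lambda i: logits[i])]
print(predicted)
```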
|
|
---

### Training

* Epochs: 3
* Dataset size: 100 samples
* Train/validation/test split used
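
The split percentages are not stated on this card, but they can be sanity-checked against the training-results table further down: 9 optimizer steps per epoch at batch size 8 bounds the size of the training split. A quick check using only numbers reported on this card:

```python
import math

# Values reported on this card.
steps_per_epoch = 9
batch_size = 8

# steps_per_epoch = ceil(train_size / batch_size), so train_size is bounded:
max_train = steps_per_epoch * batch_size            # 72
min_train = (steps_per_epoch - 1) * batch_size + 1  # 65
print(min_train, max_train)

# A conventional 70/15/15 split of 100 samples (train_size = 70) fits:
assert math.ceil(70 / batch_size) == steps_per_epoch
```

So the training split holds between 65 and 72 of the 100 samples; the exact ratio is not stated.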
|
|
---

### Performance Notes

* Training and validation loss decreased across epochs
* The model performs well on common help scenarios
* Accuracy is limited by the small dataset size

---

### Example Usage

```python
predict("I'm stuck on my project and don't know what to do")
```

Output:

```json
{
  "type": "project_help",
  "confidence": 0.82
}
```
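
The `predict` helper above is not defined on this card. One way it could be implemented is to softmax the model's logits and report the top label with its probability; a pure-Python sketch of that post-processing (the logit values below are invented for illustration):

```python
import math

labels = ["learning_help", "project_help", "attendance_issue",
          "technical_issue", "general_guidance"]

def postprocess(logits):
    """Turn raw logits into the {"type", "confidence"} shape shown above."""
    shifted = [x - max(logits) for x in logits]  # subtract max for stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return {"type": labels[best], "confidence": round(probs[best], 2)}

print(postprocess([0.4, 2.3, -0.8, 0.6, 0.1]))
```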
|
|
---

### Use Case

This model is designed to be integrated into:

* MCP server tools
* Slack-based support systems
* AI assistants for CIC students

---

### Future Improvements

* Fine-tune on a larger CIC dataset
* Add real-time feedback learning
* Integrate with response-generation models
* Improve classification accuracy on more edge cases

---
|
|
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
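
As a sketch, the list above corresponds roughly to the following `TrainingArguments` configuration — an assumed reconstruction for illustration, not the author's actual training script:

```python
from transformers import TrainingArguments

# Assumed reconstruction of the hyperparameters listed above; the
# output_dir name is hypothetical.
args = TrainingArguments(
    output_dir="help_classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```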

---

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.3887        | 1.0   | 9    | 1.4495          |
| 1.2613        | 2.0   | 18   | 1.3350          |
| 1.1704        | 3.0   | 27   | 1.3083          |

---

### Framework versions

- Transformers 5.0.0
- Pytorch 2.10.0+cpu
- Datasets 4.0.0
- Tokenizers 0.22.2