---
language: en
license: mit
tags:
- text-classification
- gpt2
- transformers
- pytorch
- custom-architecture
- tiktoken
library_name: transformers
---

# TicketClassificationGPT

## Model Summary

**TicketClassificationGPT** is a GPT-2-based text classification model, implemented from scratch, that classifies IT support tickets into 8 predefined categories.  
The model follows the original OpenAI GPT-2 architecture and starts from its pretrained weights, with the language modeling head replaced by a custom classification head. Only the classification head and the final transformer layers were fine-tuned for the ticket classification task.

This model can be loaded through the Hugging Face `transformers` library using `AutoModel.from_pretrained` with `trust_remote_code=True`.

---

## How to Get Started with the Model

### Inference Example (Transformers + tiktoken)

```python
from transformers import AutoModel
import tiktoken

# Load tokenizer
tokenizer = tiktoken.get_encoding("gpt2")

# Load model
model_id = "FarhanAK128/TicketClassificationGPT"
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Example prediction
text = "Need extra space on Google Drive."
prediction = model.predict(text, tokenizer)

print("Predicted class:", prediction) # Predicted class: Storage
```

**Note:** This model uses a custom `.predict()` method defined in the repository and requires `trust_remote_code=True` to function.

---

## Model Details

### Model Description

- **Developed by:** Farhan Ali Khan  
- **Model type:** GPT-2-like text classification model  
- **Framework:** PyTorch  
- **Task:** Text Classification  
- **Number of classes:** 8  
- **Language:** English  
- **License:** MIT   

### 📋 Classification Labels

| Class ID | Category                 |
|----------|--------------------------|
| 0        | Hardware                 |
| 1        | HR Support               |
| 2        | Access                   |
| 3        | Miscellaneous            |
| 4        | Storage                  |
| 5        | Purchase                 |
| 6        | Internal Project         |
| 7        | Administrative Rights    |
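
For code that works with raw class IDs, the table above can be mirrored as a plain lookup. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `label_for` are illustrative and are not taken from the repository's code.

```python
# Class IDs and category names copied from the Classification Labels table.
# The variable and function names here are illustrative, not the repository's.
ID2LABEL = {
    0: "Hardware",
    1: "HR Support",
    2: "Access",
    3: "Miscellaneous",
    4: "Storage",
    5: "Purchase",
    6: "Internal Project",
    7: "Administrative Rights",
}

# Reverse mapping, useful when encoding labeled data for further fine-tuning.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the category name for a class ID, raising on unknown IDs."""
    if class_id not in ID2LABEL:
        raise ValueError(f"Unknown class ID: {class_id}")
    return ID2LABEL[class_id]


print(label_for(4))  # Storage
```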


### Model Sources

- **Repository:** https://huggingface.co/FarhanAK128/TicketClassificationGPT  
- **Base model:** GPT-2-like architecture implemented from scratch, initialized from OpenAI GPT-2 weights
- **Paper:** https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf  

---

## Training Details
### Training Data

The model was trained on the **IT Service Ticket Classification Dataset** available on Kaggle.

- **Dataset name:** IT Service Ticket Classification Dataset  
- **Source:** Kaggle  
- **Link:** https://www.kaggle.com/datasets/adisongoh/it-service-ticket-classification-dataset  
- **Content:** Labeled IT support ticket text data  
- **Language:** English  

The dataset was used for supervised multi-class classification after standard text preprocessing and tokenization.


### Training Procedure

- **Base weights:** OpenAI GPT-2
- **Fine-tuning strategy:** Partial fine-tuning (classification head + final transformer layers)
- **Optimizer:** AdamW
- **Learning rate:** 1e-4
- **Weight decay:** 0.1
- **Epochs:** 5
- **Random seed:** 123
- **Loss function:** Cross-Entropy Loss
- **Training regime:** FP32
- **Evaluation frequency:** Every 30 steps
- **Total training time:** ~140 minutes
- **Final training loss:** ~0.61
- **Final validation loss:** ~0.86
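
To put the reported loss values in context, the sketch below compares them with the loss of a classifier that guesses uniformly over the 8 classes. It assumes the reported losses are mean cross-entropy values in nats (natural logarithm), which is the default behavior of PyTorch's `torch.nn.CrossEntropyLoss` but is not stated explicitly in this card. The function names are illustrative.

```python
import math

NUM_CLASSES = 8    # from the model description
TRAIN_LOSS = 0.61  # approximate final training loss reported above
VAL_LOSS = 0.86    # approximate final validation loss reported above


def uniform_baseline_loss(num_classes: int) -> float:
    """Cross-entropy (nats) of a model that gives every class equal probability."""
    return math.log(num_classes)


def geometric_mean_true_class_prob(mean_loss: float) -> float:
    """Convert a mean cross-entropy in nats into the geometric-mean
    probability assigned to the correct class."""
    return math.exp(-mean_loss)


print(f"Uniform baseline loss: {uniform_baseline_loss(NUM_CLASSES):.3f}")            # 2.079
print(f"Train true-class prob: {geometric_mean_true_class_prob(TRAIN_LOSS):.3f}")    # 0.543
print(f"Val true-class prob:   {geometric_mean_true_class_prob(VAL_LOSS):.3f}")      # 0.423
```

Under that assumption, both reported losses sit well below the uniform baseline of about 2.08.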

### 📈 Training Progress
#### Training and Validation Loss
![Training and Validation Loss](https://cdn-uploads.huggingface.co/production/uploads/65bc1af7ce846f8aa908a978/IDm_z1ud6mP35T_8eMHNl.png)


#### Training and Validation Accuracy
![Training and Validation Accuracy](https://cdn-uploads.huggingface.co/production/uploads/65bc1af7ce846f8aa908a978/WwprMh8Ohj3fy0tjnb0Sd.png)

### 📊 Model Performance

| Dataset Split | Accuracy |
|--------------|----------|
| 🏋️ Training   | **76.54%** |
| 🧪 Validation | **75.67%** |
| 🧠 Test       | **73.83%** |

---

## Uses

### Direct Use

This model can be used directly to classify short IT support ticket texts into predefined categories.

Example use cases:
- Automated ticket routing
- Helpdesk prioritization
- Internal IT workflow automation
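
As one way to picture automated ticket routing, the sketch below maps a predicted category to a support queue. The queue names, `QUEUE_BY_CATEGORY`, `FALLBACK_QUEUE`, `route_ticket`, and the stand-in classifier are all hypothetical; only the category names come from this card.

```python
from typing import Callable, Dict

# Hypothetical queue names; replace them with the queues your helpdesk uses.
QUEUE_BY_CATEGORY: Dict[str, str] = {
    "Hardware": "desktop-support",
    "HR Support": "hr-helpdesk",
    "Access": "identity-team",
    "Storage": "infrastructure",
    "Purchase": "procurement",
    "Internal Project": "pmo",
    "Administrative Rights": "identity-team",
}

# "Miscellaneous" and any unexpected label fall back to manual triage.
FALLBACK_QUEUE = "manual-triage"


def route_ticket(text: str, classify: Callable[[str], str]) -> str:
    """Classify a ticket and return the name of the queue it should go to."""
    category = classify(text)
    return QUEUE_BY_CATEGORY.get(category, FALLBACK_QUEUE)


def fake_classify(text: str) -> str:
    """Stand-in classifier so the sketch runs without downloading the model."""
    return "Storage" if "space" in text.lower() else "Miscellaneous"


print(route_ticket("Need extra space on Google Drive.", fake_classify))  # infrastructure
print(route_ticket("Something else entirely.", fake_classify))           # manual-triage
```

With the real model, `classify` could be `lambda t: model.predict(t, tokenizer)`, assuming `predict` returns the category name as a string, as the inference example's output suggests.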

### Downstream Use

The model may be further fine-tuned on:
- Organization-specific ticket data
- Expanded label sets
- Domain-specific terminology

### Out-of-Scope Use

- Multilingual text classification  
- Open-domain topic classification  
- Legal, medical, or safety-critical decision-making  

---

## Bias, Risks, and Limitations

- Trained on a limited-domain dataset (IT support tickets)
- Not evaluated for demographic or social bias
- Predictions may be unreliable for unseen ticket categories
- Performance depends on input text quality and length

### Recommendations

Human validation is recommended before using predictions in production systems.  
For best results, further fine-tuning on in-domain data is advised.

---

## Model Card Authors
**Farhan Ali Khan**

## Model Card Contact
For questions or feedback, please reach out via my Hugging Face profile:
[FarhanAK128](https://huggingface.co/FarhanAK128)