---
language: en
tags:
- sentiment-analysis
- it-support
- help-desk
- customer-feedback
- roberta
- regression
license: cc-by-4.0
datasets:
- custom
metrics:
- accuracy
- mse
- mae
pipeline_tag: text-classification
---

# IT Help Desk Agent Sentiment Analysis

## Model Description

This is a fine-tuned RoBERTa model specifically designed for sentiment analysis of IT help desk and technical support interactions. The model performs **regression-based sentiment analysis**, outputting continuous sentiment scores from -1 (very negative) to +1 (very positive), which are then classified into three categories:

- **Negative** (score ≤ -0.33): Dissatisfied customers, complaints, frustrations
- **Neutral** (score between -0.33 and 0.33): Informational requests, neutral feedback
- **Positive** (score ≥ 0.33): Satisfied customers, praise, positive experiences
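The score-to-label mapping is a simple thresholding step; as a standalone sketch (thresholds taken from the rules above):

```python
def classify_score(score: float) -> str:
    """Map a continuous sentiment score in [-1, 1] to a discrete label."""
    if score <= -0.33:
        return "Negative"
    if score >= 0.33:
        return "Positive"
    return "Neutral"

print(classify_score(-0.8), classify_score(0.1), classify_score(0.6))
# Negative Neutral Positive
```

Note that the boundary values -0.33 and 0.33 fall into the Negative and Positive buckets, respectively.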

## Intended Use

This model is optimized for analyzing:
- IT support ticket feedback
- Help desk interaction sentiment
- Technical support chat transcripts
- Customer service feedback in IT contexts
- Agent performance evaluation through customer sentiment

## Model Architecture

- **Base Model**: RoBERTa-base
- **Architecture**: Custom regression head with dropout layers
- **Output**: Continuous sentiment score (-1 to +1) with discrete classification
- **Max Sequence Length**: 512 tokens

## Usage

```python
# Simple usage example for KameronB/help-desk-agent-sentiment
import torch
import torch.nn as nn
from transformers import RobertaTokenizer, RobertaModel
from huggingface_hub import hf_hub_download

# Define the model architecture
class RobertaForSentimentRegression(nn.Module):
    def __init__(self, model_name='roberta-base', dropout_rate=0.3):
        super(RobertaForSentimentRegression, self).__init__()
        self.roberta = RobertaModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(dropout_rate)
        self.regression_head = nn.Sequential(
            nn.Linear(768, 256), nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(256, 64), nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(64, 1), nn.Tanh()
        )
        
    def forward(self, input_ids, attention_mask):
        outputs = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = self.dropout(outputs.last_hidden_state[:, 0, :])
        return self.regression_head(pooled_output).squeeze(-1)


# Load model and tokenizer from Hugging Face
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = RobertaTokenizer.from_pretrained('KameronB/help-desk-agent-sentiment')
model = RobertaForSentimentRegression()

# Download and load model weights
model_path = hf_hub_download(repo_id="KameronB/help-desk-agent-sentiment", 
                            filename="roberta_sentiment_regression.pth")
model.load_state_dict(torch.load(model_path, map_location=device))
model.to(device).eval()

# Prediction function
def predict_sentiment(text):
    encoding = tokenizer(
        text, 
        truncation=True, 
        padding='max_length', 
        max_length=512, 
        return_tensors='pt'
    )

    with torch.no_grad():
        score = model(
            encoding['input_ids'].to(device), 
            encoding['attention_mask'].to(device)
        ).item()
    
    if score <= -0.33:
        return score, 'Negative'
    elif score >= 0.33:
        return score, 'Positive'
    else:
        return score, 'Neutral'

# Example usage
examples = [
    "The technician was amazing and fixed my issue quickly!",
    "I waited hours and the agent was unhelpful.",
    "I submitted a ticket yesterday about password reset.",
    "The technician was absolutely amazing and fixed my issue in minutes!",
    "I waited on hold for 2 hours and the agent was completely unhelpful.",
    "I called the help desk at 3 PM about my password reset.",
    "The support team is the worst I've ever dealt with - completely incompetent!",
    "Sarah was very professional and walked me through the solution step by step.",
    "My ticket was submitted yesterday and I received confirmation.",
    "This is the most frustrating experience I've ever had with tech support!",
    "The agent was okay, nothing special but got the job done.",
    "Mike is an absolute legend! Best tech support ever!",
    "I'm not sure if the issue is completely resolved yet.",
    "It is a bit rainy outside today"
]

for text in examples:
    score, sentiment = predict_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment} (Score: {score:.3f})")

```


To get the sentiment of a longer piece of text, use the function below. It scores each sentence individually, then returns a length-weighted average as the final sentiment.

```python
# Install and import NLTK for sentence tokenization
import nltk
nltk.download('punkt', quiet=True)
nltk.download('punkt_tab', quiet=True)  # required by newer NLTK releases
from nltk.tokenize import sent_tokenize

def predict_paragraph_sentiment(text):
    """
    Predict sentiment for a paragraph by analyzing individual sentences
    and calculating a weighted average based on sentence length.
    
    Args:
        text (str): The paragraph text to analyze
        
    Returns:
        tuple: (weighted_score, overall_sentiment, sentence_details)
    """
    # Break text into sentences using NLTK
    sentences = sent_tokenize(text)
    
    if not sentences:
        return 0.0, 'Neutral', []
    
    # Analyze each sentence
    sentence_results = []
    total_chars = len(text)
    
    for sentence in sentences:
        sentence = sentence.strip()
        if sentence:  # Skip empty sentences
            score, sentiment = predict_sentiment(sentence)
            char_weight = len(sentence) / total_chars
            
            sentence_results.append({
                'sentence': sentence,
                'score': score,
                'sentiment': sentiment,
                'length': len(sentence),
                'weight': char_weight
            })
    
    # Calculate weighted average
    if not sentence_results:
        return 0.0, 'Neutral', []
    
    weighted_score = sum(result['score'] * result['weight'] for result in sentence_results)
    
    # Determine overall sentiment
    if weighted_score <= -0.33:
        overall_sentiment = 'Negative'
    elif weighted_score >= 0.33:
        overall_sentiment = 'Positive'
    else:
        overall_sentiment = 'Neutral'
    
    return weighted_score, overall_sentiment, sentence_results

# Test the paragraph sentiment function
test_paragraph = """
The IT support experience was mixed today. The initial wait time was absolutely terrible - I was on hold for over an hour! 
However, once I got through to Mike, he was fantastic. He quickly diagnosed the issue with my VPN connection and walked me 
through the solution step by step. The whole resolution took about 15 minutes once we started working on it. 
While the wait was frustrating, I'm satisfied with the technical support I received.
"""

print("=== Paragraph Sentiment Analysis Example ===")
print(f"Text: {test_paragraph.strip()}")
print("\n" + "="*80)

weighted_score, overall_sentiment, sentence_details = predict_paragraph_sentiment(test_paragraph)

print(f"\nOVERALL RESULTS:")
print(f"Weighted Score: {weighted_score:.3f}")
print(f"Overall Sentiment: {overall_sentiment}")

print(f"\nSENTENCE BREAKDOWN:")
for i, detail in enumerate(sentence_details, 1):
    print(f"{i}. \"{detail['sentence']}\"")
    print(f"   Score: {detail['score']:.3f} | Sentiment: {detail['sentiment']} | Weight: {detail['weight']:.3f}")
    print()

print("="*80)

```
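To see how the length weighting behaves, here is the core computation applied to two hypothetical sentence scores (not real model outputs): the longer negative sentence pulls the average down more than a simple mean would.

```python
# Hypothetical (sentence, score) pairs for illustration only
scored = [("Great help!", 0.9), ("The wait was far too long today.", -0.6)]
total_chars = sum(len(s) for s, _ in scored)

# Each sentence contributes in proportion to its character length
weighted = sum(score * len(s) / total_chars for s, score in scored)
print(round(weighted, 3))
# -0.216
```

An unweighted mean of the two scores would be 0.15 (mildly positive), while the length-weighted score is about -0.22, reflecting that most of the text is negative.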

If you want to quantize this model to substantially reduce memory usage, you can use torchao. The following config is a good fit for running on a laptop or other small device:

```python
from torchao.quantization import quantize_, Int8WeightOnlyConfig

model.eval().to("cpu")

# In-place: converts Linear layers to int8 weights
quantize_(model, Int8WeightOnlyConfig())
```
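As a rough back-of-envelope estimate (assuming ~125M parameters for RoBERTa-base, most of them in Linear weights), int8 weight-only quantization stores 1 byte per weight instead of 4:

```python
params = 125_000_000               # approximate RoBERTa-base parameter count
fp32_mib = params * 4 / 2**20      # float32: 4 bytes per weight
int8_mib = params * 1 / 2**20      # int8 weight-only: 1 byte per weight
print(f"fp32 ~{fp32_mib:.0f} MiB, int8 ~{int8_mib:.0f} MiB")
# fp32 ~477 MiB, int8 ~119 MiB
```

In practice the saving is somewhat smaller, since `Int8WeightOnlyConfig` only converts Linear layers; embeddings and activations remain in floating point.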