---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- tone-detection
- text-classification
- nlp
- transformers
- production
base_model: distilbert-base-uncased
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "Can you explain this again?"
  example_title: "Questioning"
- text: "I strongly disagree with this decision."
  example_title: "Assertive"
- text: "This is absolutely terrible!"
  example_title: "Frustrated"
- text: "Great job on the presentation!"
  example_title: "Enthusiastic"
- text: "Here are the key findings from the report."
  example_title: "Informational"
---

# Tone Baseline v3

## Model Summary

**Tone Baseline v3** is a lightweight English text classification model designed to detect the **communicative tone** of short-form text.

The model predicts a **single dominant tone**, along with a confidence score and probability distribution across all supported tone categories. It is optimized for **real-world production use**, including writing assistants, browser extensions, and backend APIs.

---

## Model Details

### Model Description

- **Developed by:** Lokesh P  
- **Model type:** Multi-class text classification (tone detection)  
- **Language(s):** English  
- **License:** Apache 2.0  
- **Framework:** Hugging Face Transformers (PyTorch)  
- **Base Model:** DistilBERT (distilbert-base-uncased)

The model is intended to be used as a **pre-processing or analysis component** in applications that need to understand how a piece of text is phrased (e.g., polite vs rude, questioning vs informational), rather than what the text is about.

---

### Supported Tone Labels

The model predicts one of the following tone labels:

- `supportive`
- `enthusiastic`
- `frustrated`
- `rude`
- `informational`
- `questioning`
- `formal`
- `assertive`
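
The pipeline returns these labels directly; under the hood, the classification head produces one logit per class and a softmax picks the dominant tone. A minimal sketch of that final step, using an illustrative logit vector (not real model output) and an assumed label order:

```python
import math

# The eight tone labels, in an assumed id-to-label order.
LABELS = ["supportive", "enthusiastic", "frustrated", "rude",
          "informational", "questioning", "formal", "assertive"]

def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for a sentence like "Can you explain this again?"
logits = [0.1, -0.3, -1.2, -2.0, 0.4, 4.2, -0.5, 0.0]
probs = softmax(logits)
dominant = LABELS[probs.index(max(probs))]
print(dominant, round(max(probs), 3))  # dominant tone with its confidence
```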

---

## Uses

### Direct Use

This model can be used directly for:

- Tone detection in messages, emails, or chat inputs
- UX feedback on how a message may be perceived
- Pre-routing text into different rewrite or moderation pipelines
- Writing assistance tools

Example direct usage:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None
)

text = "I strongly disagree with this decision."
result = classifier(text)
print(result)
```

---

### Downstream Use

The model is commonly used as part of a larger system, for example:

- As an **input signal** for text rewriting systems
- As a **decision layer** before invoking a generative model
- As part of browser extensions or API services
- As a lightweight moderation or feedback component
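
The "decision layer" pattern above can be sketched as a small router that inspects the predicted tone and its confidence before choosing the next step. The thresholds, route names, and rewrite policy below are illustrative assumptions, not part of the model:

```python
REWRITE_TONES = {"rude", "frustrated"}  # tones we might soften (assumed policy)

def route(prediction, min_confidence=0.7):
    """Decide what a downstream system should do with a tone prediction.

    `prediction` is a (label, score) pair as produced by the classifier;
    the threshold and route names are illustrative defaults.
    """
    label, score = prediction
    if score < min_confidence:
        return "pass-through"  # low confidence: don't act on the signal
    if label in REWRITE_TONES:
        return "rewrite"       # hand off to a generative rewriting model
    return "annotate"          # just surface the detected tone in the UI

print(route(("rude", 0.95)))         # prints "rewrite"
print(route(("questioning", 0.99)))  # prints "annotate"
print(route(("formal", 0.41)))       # prints "pass-through"
```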

---

### Out-of-Scope Use

This model **should NOT** be used for:

- Psychological or mental health diagnosis
- Personality inference
- Detecting intent, deception, or truthfulness
- Legal, medical, or safety-critical decision making
- Surveillance or profiling of individuals

The model classifies **text tone only**, not user intent or emotional state.

---

## Bias, Risks, and Limitations

### Known Limitations

- English-only
- Performs best on short to medium-length text
- May misclassify sarcasm or highly contextual statements
- Sensitive to ambiguous phrasing
- Not designed for long documents or multi-paragraph inputs

### Bias Considerations

The model reflects patterns present in its training data and may encode biases related to tone interpretation. Outputs should be treated as **assistive signals**, not absolute judgments.

---

### Recommendations

- Use the model as **one signal among many**, not a final authority
- Avoid high-stakes automated decisions based solely on model output
- Perform task-specific evaluation before deployment in sensitive domains

---

## How to Get Started

### Using Transformers

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None
)

result = classifier("Can you explain this again?")
print(result)
```

### Example Output

```json
[
  {
    "label": "questioning",
    "score": 0.9992
  },
  {
    "label": "supportive",
    "score": 0.0002
  },
  {
    "label": "informational",
    "score": 0.0001
  }
]
```

---

## Training Details

### Training Data

The model was trained on a curated dataset of English text annotated for **communicative tone**.

**Data characteristics:**

- Short-form written English
- Conversational and instructional text
- Neutral, emotional, and directive language
- No personally identifiable information (PII) intentionally included

> Exact dataset sources are not publicly released.

---

### Training Procedure

- Tokenization using a transformer-compatible tokenizer
- Supervised fine-tuning for multi-class classification
- Softmax output layer over tone labels

#### Training Hyperparameters

- **Max sequence length:** 128 tokens
- **Training regime:** fp32
- **Optimizer:** AdamW (standard configuration)
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3-5
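
The hyperparameters above map onto a standard Transformers fine-tuning setup. Since the exact training script is not published, the following is a reconstruction of that configuration as plain values; the variable names are illustrative:

```python
# Reconstruction of the fine-tuning configuration from the
# hyperparameter list above; names and the epoch choice are assumptions.
config = {
    "base_model": "distilbert-base-uncased",
    "num_labels": 8,                     # one per tone label
    "max_seq_length": 128,               # inputs truncated/padded to 128 tokens
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 4,               # the card reports 3-5 epochs
    "optimizer": "adamw",                # standard AdamW configuration
    "fp16": False,                       # fp32 training regime
}
print(config["max_seq_length"], config["learning_rate"])
```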

---

## Evaluation

### Evaluation Approach

The model was evaluated using:

- Held-out validation data
- Manual qualitative testing
- Real-world usage in API and browser-extension workflows

### Observed Strengths

- High accuracy on short queries and statements
- Strong differentiation between questioning vs informational tone
- Stable confidence distributions
- Low-latency inference on CPU

### Metrics

- **Accuracy:** ~92% on validation set
- **F1 Score (macro):** ~0.90
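
Macro F1 averages the per-class F1 scores without weighting by class frequency, so rare tones count as much as common ones. A reference implementation of the metric, run on toy labels rather than the actual validation set:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in the data."""
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: perfect on "questioning", one "rude" misread as "formal".
y_true = ["questioning", "rude", "rude", "formal"]
y_pred = ["questioning", "rude", "formal", "formal"]
print(round(macro_f1(y_true, y_pred), 3))  # prints 0.778
```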

---

## Environmental Impact

The model was trained using standard GPU infrastructure. Exact carbon emissions were not formally measured.

- **Hardware:** GPU (cloud-based)
- **Cloud Provider:** Not disclosed
- **Compute Region:** Not disclosed
- **Training Time:** Approximately 2-4 hours

---

## Technical Specifications

### Model Architecture and Objective

- Transformer-based encoder (DistilBERT)
- Multi-class classification objective
- Softmax probability distribution over tone labels
- 8 output classes

### Compute Infrastructure

#### Hardware

- GPU for training
- CPU-friendly inference

#### Software

- Python 3.8+
- PyTorch 2.0+
- Hugging Face Transformers 4.30+

---

## Citation

If you use this model in your work, attribution is appreciated.

**BibTeX:**

```bibtex
@misc{tonebaselinev3,
  author = {Lokesh P},
  title = {Tone Baseline v3},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LokeshDevCreates/tone-baseline-v3}}
}
```

---

## Model Card Authors

- Lokesh P

---

## Model Card Contact

For questions or issues, please open an issue on the Hugging Face model repository or contact via the Hugging Face platform.

---

## Acknowledgments

This model was developed to support tone-aware text processing in production applications. Thanks to the Hugging Face community for providing excellent tools and infrastructure.