---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- tone-detection
- text-classification
- nlp
- transformers
- production
base_model: distilbert-base-uncased
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "Can you explain this again?"
  example_title: "Questioning"
- text: "I strongly disagree with this decision."
  example_title: "Assertive"
- text: "This is absolutely terrible!"
  example_title: "Frustrated"
- text: "Great job on the presentation!"
  example_title: "Enthusiastic"
- text: "Here are the key findings from the report."
  example_title: "Informational"
---

# Tone Baseline v3

## Model Summary

**Tone Baseline v3** is a lightweight English text classification model designed to detect the **communicative tone** of short-form text.

The model predicts a **single dominant tone**, along with a confidence score and a probability distribution across all supported tone categories. It is optimized for **real-world production use**, including writing assistants, browser extensions, and backend APIs.

---

## Model Details

### Model Description

- **Developed by:** Lokesh P
- **Model type:** Multi-class text classification (tone detection)
- **Language(s):** English
- **License:** Apache 2.0
- **Framework:** Hugging Face Transformers (PyTorch)
- **Base model:** DistilBERT (`distilbert-base-uncased`)

The model is intended to be used as a **pre-processing or analysis component** in applications that need to understand how a piece of text is phrased (e.g., polite vs. rude, questioning vs. informational), rather than what the text is about.
---

### Supported Tone Labels

The model predicts one of the following tone labels:

- `supportive`
- `enthusiastic`
- `frustrated`
- `rude`
- `informational`
- `questioning`
- `formal`
- `assertive`

---

## Uses

### Direct Use

This model can be used directly for:

- Tone detection in messages, emails, or chat inputs
- UX feedback on how a message may be perceived
- Pre-routing text into different rewrite or moderation pipelines
- Writing assistance tools

Example direct usage:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None,
)

text = "I strongly disagree with this decision."
result = classifier(text)
print(result)
```

---

### Downstream Use

The model is commonly used as part of a larger system, for example:

- As an **input signal** for text rewriting systems
- As a **decision layer** before invoking a generative model
- As part of browser extensions or API services
- As a lightweight moderation or feedback component

---

### Out-of-Scope Use

This model should **not** be used for:

- Psychological or mental health diagnosis
- Personality inference
- Detecting intent, deception, or truthfulness
- Legal, medical, or safety-critical decision making
- Surveillance or profiling of individuals

The model classifies **text tone only**, not user intent or emotional state.

---

## Bias, Risks, and Limitations

### Known Limitations

- English-only
- Performs best on short- to medium-length text
- May misclassify sarcasm or highly contextual statements
- Sensitive to ambiguous phrasing
- Not designed for long documents or multi-paragraph inputs

### Bias Considerations

The model reflects patterns present in its training data and may encode biases related to tone interpretation. Outputs should be treated as **assistive signals**, not absolute judgments.
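One practical way to treat predictions as assistive signals is to act only when the top score clears a confidence threshold, and fall back to no decision otherwise. A minimal sketch, assuming the `top_k=None` pipeline output format shown above; the `0.8` threshold and the `dominant_tone` helper are illustrative assumptions, not part of the released model:

```python
def dominant_tone(predictions, threshold=0.8):
    """Return the top tone label if its score clears the threshold, else None.

    `predictions` is a list of {"label": str, "score": float} dicts,
    as returned by the pipeline when called with top_k=None.
    """
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= threshold else None


# Scores shaped like the Example Output further below:
preds = [
    {"label": "questioning", "score": 0.9992},
    {"label": "supportive", "score": 0.0002},
    {"label": "informational", "score": 0.0001},
]
print(dominant_tone(preds))                    # -> "questioning"
print(dominant_tone(preds, threshold=0.9999))  # -> None (not confident enough)
```

Downstream systems can then route low-confidence inputs to a human or a default path instead of acting on an uncertain label.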
---

### Recommendations

- Use the model as **one signal among many**, not a final authority
- Avoid high-stakes automated decisions based solely on model output
- Perform task-specific evaluation before deployment in sensitive domains

---

## How to Get Started

### Using Transformers

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None,
)

result = classifier("Can you explain this again?")
print(result)
```

### Example Output

```json
[
  { "label": "questioning", "score": 0.9992 },
  { "label": "supportive", "score": 0.0002 },
  { "label": "informational", "score": 0.0001 }
]
```

---

## Training Details

### Training Data

The model was trained on a curated dataset of English text annotated for **communicative tone**.

**Data characteristics:**

- Short-form written English
- Conversational and instructional text
- Neutral, emotional, and directive language
- No personally identifiable information (PII) intentionally included

> Exact dataset sources are not publicly released.
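To make the annotation format concrete, here is a hedged sketch of how labeled examples might be encoded for supervised fine-tuning. The record layout and the integer id ordering are illustrative assumptions, since the dataset itself is not released; the model's `config.json` holds the authoritative label-to-id mapping:

```python
# Illustrative label mapping over the eight supported tone classes.
# The id order here is an assumption for demonstration purposes.
TONE_LABELS = [
    "supportive", "enthusiastic", "frustrated", "rude",
    "informational", "questioning", "formal", "assertive",
]
label2id = {label: i for i, label in enumerate(TONE_LABELS)}
id2label = {i: label for label, i in label2id.items()}

# A hypothetical annotated record, encoded for multi-class training:
record = {"text": "Can you explain this again?", "tone": "questioning"}
encoded = {"text": record["text"], "label": label2id[record["tone"]]}
print(encoded["label"])  # -> 5 under this illustrative ordering
```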
---

### Training Procedure

- Tokenization using a transformer-compatible tokenizer
- Supervised fine-tuning for multi-class classification
- Softmax output layer over tone labels

#### Training Hyperparameters

- **Max sequence length:** 128 tokens
- **Training regime:** fp32
- **Optimizer:** AdamW (standard configuration)
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3-5

---

## Evaluation

### Evaluation Approach

The model was evaluated using:

- Held-out validation data
- Manual qualitative testing
- Real-world usage in API and browser-extension workflows

### Observed Strengths

- High accuracy on short queries and statements
- Strong differentiation between questioning and informational tone
- Stable confidence distributions
- Low-latency inference on CPU

### Metrics

- **Accuracy:** ~92% on the validation set
- **F1 score (macro):** ~0.90

---

## Environmental Impact

The model was trained using standard GPU infrastructure. Exact carbon emissions were not formally measured.

- **Hardware:** GPU (cloud-based)
- **Cloud provider:** Not disclosed
- **Compute region:** Not disclosed
- **Training time:** Approximately 2-4 hours

---

## Technical Specifications

### Model Architecture and Objective

- Transformer-based encoder (DistilBERT)
- Multi-class classification objective
- Softmax probability distribution over tone labels
- 8 output classes

### Compute Infrastructure

#### Hardware

- GPU for training
- CPU-friendly inference

#### Software

- Python 3.8+
- PyTorch 2.0+
- Hugging Face Transformers 4.30+

---

## Citation

If you use this model in your work, attribution is appreciated.
**BibTeX:**

```bibtex
@misc{tonebaselinev3,
  author       = {Lokesh P},
  title        = {Tone Baseline v3},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LokeshDevCreates/tone-baseline-v3}}
}
```

---

## Model Card Authors

- Lokesh P

---

## Model Card Contact

For questions or issues, please open an issue on the Hugging Face model repository or contact the author via the Hugging Face platform.

---

## Acknowledgments

This model was developed to support tone-aware text processing in production applications. Thanks to the Hugging Face community for providing excellent tools and infrastructure.