---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- tone-detection
- text-classification
- nlp
- transformers
- production
base_model: distilbert-base-uncased
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "Can you explain this again?"
  example_title: "Questioning"
- text: "I strongly disagree with this decision."
  example_title: "Assertive"
- text: "This is absolutely terrible!"
  example_title: "Frustrated"
- text: "Great job on the presentation!"
  example_title: "Enthusiastic"
- text: "Here are the key findings from the report."
  example_title: "Informational"
---

# Tone Baseline v3

## Model Summary

**Tone Baseline v3** is a lightweight English text classification model designed to detect the **communicative tone** of short-form text.

The model predicts a **single dominant tone**, along with a confidence score and probability distribution across all supported tone categories. It is optimized for **real-world production use**, including writing assistants, browser extensions, and backend APIs.

---

## Model Details

### Model Description

- **Developed by:** Lokesh P  
- **Model type:** Multi-class text classification (tone detection)  
- **Language(s):** English  
- **License:** Apache 2.0  
- **Framework:** Hugging Face Transformers (PyTorch)  
- **Base Model:** DistilBERT (distilbert-base-uncased)

The model is intended to be used as a **pre-processing or analysis component** in applications that need to understand how a piece of text is phrased (e.g., polite vs rude, questioning vs informational), rather than what the text is about.

---

### Supported Tone Labels

The model predicts one of the following tone labels:

- `supportive`
- `enthusiastic`
- `frustrated`
- `rude`
- `informational`
- `questioning`
- `formal`
- `assertive`
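
The pipeline returns these labels directly; under the hood, the classification head produces one logit per class and a softmax picks the dominant tone. A minimal sketch of that final step, using an illustrative logit vector (not real model output) and an assumed label order:

```python
import math

# The eight tone labels, in an assumed id-to-label order.
LABELS = ["supportive", "enthusiastic", "frustrated", "rude",
          "informational", "questioning", "formal", "assertive"]

def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for a sentence like "Can you explain this again?"
logits = [0.1, -0.3, -1.2, -2.0, 0.4, 4.2, -0.5, 0.0]
probs = softmax(logits)
dominant = LABELS[probs.index(max(probs))]
print(dominant, round(max(probs), 3))  # dominant tone with its confidence
```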

---

## Uses

### Direct Use

This model can be used directly for:

- Tone detection in messages, emails, or chat inputs
- UX feedback on how a message may be perceived
- Pre-routing text into different rewrite or moderation pipelines
- Writing assistance tools

Example direct usage:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None
)

text = "I strongly disagree with this decision."
result = classifier(text)
print(result)
```

---

### Downstream Use

The model is commonly used as part of a larger system, for example:

- As an **input signal** for text rewriting systems
- As a **decision layer** before invoking a generative model
- As part of browser extensions or API services
- As a lightweight moderation or feedback component
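
The "decision layer" pattern above can be sketched as a small router that inspects the predicted tone and its confidence before choosing the next step. The thresholds, route names, and rewrite policy below are illustrative assumptions, not part of the model:

```python
REWRITE_TONES = {"rude", "frustrated"}  # tones we might soften (assumed policy)

def route(prediction, min_confidence=0.7):
    """Decide what a downstream system should do with a tone prediction.

    `prediction` is a (label, score) pair as produced by the classifier;
    the threshold and route names are illustrative defaults.
    """
    label, score = prediction
    if score < min_confidence:
        return "pass-through"  # low confidence: don't act on the signal
    if label in REWRITE_TONES:
        return "rewrite"       # hand off to a generative rewriting model
    return "annotate"          # just surface the detected tone in the UI

print(route(("rude", 0.95)))         # prints "rewrite"
print(route(("questioning", 0.99)))  # prints "annotate"
print(route(("formal", 0.41)))       # prints "pass-through"
```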

---

### Out-of-Scope Use

This model **should NOT** be used for:

- Psychological or mental health diagnosis
- Personality inference
- Detecting intent, deception, or truthfulness
- Legal, medical, or safety-critical decision making
- Surveillance or profiling of individuals

The model classifies **text tone only**, not user intent or emotional state.

---

## Bias, Risks, and Limitations

### Known Limitations

- English-only
- Performs best on short to medium-length text
- May misclassify sarcasm or highly contextual statements
- Sensitive to ambiguous phrasing
- Not designed for long documents or multi-paragraph inputs

### Bias Considerations

The model reflects patterns present in its training data and may encode biases related to tone interpretation. Outputs should be treated as **assistive signals**, not absolute judgments.

---

### Recommendations

- Use the model as **one signal among many**, not a final authority
- Avoid high-stakes automated decisions based solely on model output
- Perform task-specific evaluation before deployment in sensitive domains

---

## How to Get Started

### Using Transformers

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None
)

result = classifier("Can you explain this again?")
print(result)
```

### Example Output

```json
[
  {
    "label": "questioning",
    "score": 0.9992
  },
  {
    "label": "supportive",
    "score": 0.0002
  },
  {
    "label": "informational",
    "score": 0.0001
  }
]
```

---

## Training Details

### Training Data

The model was trained on a curated dataset of English text annotated for **communicative tone**.

**Data characteristics:**

- Short-form written English
- Conversational and instructional text
- Neutral, emotional, and directive language
- No personally identifiable information (PII) intentionally included

> Exact dataset sources are not publicly released.

---

### Training Procedure

- Tokenization using a transformer-compatible tokenizer
- Supervised fine-tuning for multi-class classification
- Softmax output layer over tone labels

#### Training Hyperparameters

- **Max sequence length:** 128 tokens
- **Training regime:** fp32
- **Optimizer:** AdamW (standard configuration)
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3-5
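
The hyperparameters above map onto a standard Transformers fine-tuning setup. Since the exact training script is not published, the following is a reconstruction of that configuration as plain values; the variable names are illustrative:

```python
# Reconstruction of the fine-tuning configuration from the
# hyperparameter list above; names and the epoch choice are assumptions.
config = {
    "base_model": "distilbert-base-uncased",
    "num_labels": 8,                     # one per tone label
    "max_seq_length": 128,               # inputs truncated/padded to 128 tokens
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 4,               # the card reports 3-5 epochs
    "optimizer": "adamw",                # standard AdamW configuration
    "fp16": False,                       # fp32 training regime
}
print(config["max_seq_length"], config["learning_rate"])
```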

---

## Evaluation

### Evaluation Approach

The model was evaluated using:

- Held-out validation data
- Manual qualitative testing
- Real-world usage in API and browser-extension workflows

### Observed Strengths

- High accuracy on short queries and statements
- Strong differentiation between questioning vs informational tone
- Stable confidence distributions
- Low-latency inference on CPU

### Metrics

- **Accuracy:** ~92% on validation set
- **F1 Score (macro):** ~0.90
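
Macro F1 averages the per-class F1 scores without weighting by class frequency, so rare tones count as much as common ones. A reference implementation of the metric, run on toy labels rather than the actual validation set:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in the data."""
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: perfect on "questioning", one "rude" misread as "formal".
y_true = ["questioning", "rude", "rude", "formal"]
y_pred = ["questioning", "rude", "formal", "formal"]
print(round(macro_f1(y_true, y_pred), 3))  # prints 0.778
```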

---

## Environmental Impact

The model was trained using standard GPU infrastructure. Exact carbon emissions were not formally measured.

- **Hardware:** GPU (cloud-based)
- **Cloud Provider:** Not disclosed
- **Compute Region:** Not disclosed
- **Training Time:** Approximately 2-4 hours

---

## Technical Specifications

### Model Architecture and Objective

- Transformer-based encoder (DistilBERT)
- Multi-class classification objective
- Softmax probability distribution over tone labels
- 8 output classes

### Compute Infrastructure

#### Hardware

- GPU for training
- CPU-friendly inference

#### Software

- Python 3.8+
- PyTorch 2.0+
- Hugging Face Transformers 4.30+

---

## Citation

If you use this model in your work, attribution is appreciated.

**BibTeX:**

```bibtex
@misc{tonebaselinev3,
  author = {Lokesh P},
  title = {Tone Baseline v3},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LokeshDevCreates/tone-baseline-v3}}
}
```

---

## Model Card Authors

- Lokesh P

---

## Model Card Contact

For questions or issues, please open an issue on the Hugging Face model repository or contact via the Hugging Face platform.

---

## Acknowledgments

This model was developed to support tone-aware text processing in production applications. Thanks to the Hugging Face community for providing excellent tools and infrastructure.