---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- tone-detection
- text-classification
- nlp
- transformers
- production
base_model: distilbert-base-uncased
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "Can you explain this again?"
  example_title: "Questioning"
- text: "I strongly disagree with this decision."
  example_title: "Assertive"
- text: "This is absolutely terrible!"
  example_title: "Frustrated"
- text: "Great job on the presentation!"
  example_title: "Enthusiastic"
- text: "Here are the key findings from the report."
  example_title: "Informational"
---

# Tone Baseline v3

## Model Summary

**Tone Baseline v3** is a lightweight English text classification model designed to detect the **communicative tone** of short-form text.

The model predicts a **single dominant tone**, along with a confidence score and a probability distribution across all supported tone categories. It is optimized for **real-world production use**, including writing assistants, browser extensions, and backend APIs.

---

## Model Details

### Model Description

- **Developed by:** Lokesh P
- **Model type:** Multi-class text classification (tone detection)
- **Language(s):** English
- **License:** Apache 2.0
- **Framework:** Hugging Face Transformers (PyTorch)
- **Base model:** DistilBERT (`distilbert-base-uncased`)

The model is intended to be used as a **pre-processing or analysis component** in applications that need to understand how a piece of text is phrased (e.g., polite vs. rude, questioning vs. informational), rather than what the text is about.
---

### Supported Tone Labels

The model predicts one of the following tone labels:

- `supportive`
- `enthusiastic`
- `frustrated`
- `rude`
- `informational`
- `questioning`
- `formal`
- `assertive`

---

## Uses

### Direct Use

This model can be used directly for:

- Tone detection in messages, emails, or chat inputs
- UX feedback on how a message may be perceived
- Pre-routing text into different rewrite or moderation pipelines
- Writing assistance tools

Example direct usage:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None,
)

text = "I strongly disagree with this decision."
result = classifier(text)
print(result)
```

---

### Downstream Use

The model is commonly used as part of a larger system, for example:

- As an **input signal** for text rewriting systems
- As a **decision layer** before invoking a generative model
- As part of browser extensions or API services
- As a lightweight moderation or feedback component

---

### Out-of-Scope Use

This model should **not** be used for:

- Psychological or mental health diagnosis
- Personality inference
- Detecting intent, deception, or truthfulness
- Legal, medical, or safety-critical decision making
- Surveillance or profiling of individuals

The model classifies **text tone only**, not user intent or emotional state.

---

## Bias, Risks, and Limitations

### Known Limitations

- English-only
- Performs best on short- to medium-length text
- May misclassify sarcasm or highly contextual statements
- Sensitive to ambiguous phrasing
- Not designed for long documents or multi-paragraph inputs

### Bias Considerations

The model reflects patterns present in its training data and may encode biases related to tone interpretation. Outputs should be treated as **assistive signals**, not absolute judgments.
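One practical way to treat predictions as assistive signals is to act only when the top score clears a confidence threshold, and fall back to no decision otherwise. A minimal sketch, assuming the `top_k=None` pipeline output format shown above; the `0.8` threshold and the `dominant_tone` helper are illustrative assumptions, not part of the released model:

```python
def dominant_tone(predictions, threshold=0.8):
    """Return the top tone label if its score clears the threshold, else None.

    `predictions` is a list of {"label": str, "score": float} dicts,
    as returned by the pipeline when called with top_k=None.
    """
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= threshold else None


# Scores shaped like the Example Output further below:
preds = [
    {"label": "questioning", "score": 0.9992},
    {"label": "supportive", "score": 0.0002},
    {"label": "informational", "score": 0.0001},
]
print(dominant_tone(preds))                    # -> "questioning"
print(dominant_tone(preds, threshold=0.9999))  # -> None (not confident enough)
```

Downstream systems can then route low-confidence inputs to a human or a default path instead of acting on an uncertain label.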
---

### Recommendations

- Use the model as **one signal among many**, not a final authority
- Avoid high-stakes automated decisions based solely on model output
- Perform task-specific evaluation before deployment in sensitive domains

---

## How to Get Started

### Using Transformers

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="LokeshDevCreates/tone-baseline-v3",
    top_k=None,
)

result = classifier("Can you explain this again?")
print(result)
```

### Example Output

```json
[
  { "label": "questioning", "score": 0.9992 },
  { "label": "supportive", "score": 0.0002 },
  { "label": "informational", "score": 0.0001 }
]
```

---

## Training Details

### Training Data

The model was trained on a curated dataset of English text annotated for **communicative tone**.

**Data characteristics:**

- Short-form written English
- Conversational and instructional text
- Neutral, emotional, and directive language
- No personally identifiable information (PII) intentionally included

> Exact dataset sources are not publicly released.
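To make the annotation format concrete, here is a hedged sketch of how labeled examples might be encoded for supervised fine-tuning. The record layout and the integer id ordering are illustrative assumptions, since the dataset itself is not released; the model's `config.json` holds the authoritative label-to-id mapping:

```python
# Illustrative label mapping over the eight supported tone classes.
# The id order here is an assumption for demonstration purposes.
TONE_LABELS = [
    "supportive", "enthusiastic", "frustrated", "rude",
    "informational", "questioning", "formal", "assertive",
]
label2id = {label: i for i, label in enumerate(TONE_LABELS)}
id2label = {i: label for label, i in label2id.items()}

# A hypothetical annotated record, encoded for multi-class training:
record = {"text": "Can you explain this again?", "tone": "questioning"}
encoded = {"text": record["text"], "label": label2id[record["tone"]]}
print(encoded["label"])  # -> 5 under this illustrative ordering
```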
---

### Training Procedure

- Tokenization using a transformer-compatible tokenizer
- Supervised fine-tuning for multi-class classification
- Softmax output layer over tone labels

#### Training Hyperparameters

- **Max sequence length:** 128 tokens
- **Training regime:** fp32
- **Optimizer:** AdamW (standard configuration)
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3-5

---

## Evaluation

### Evaluation Approach

The model was evaluated using:

- Held-out validation data
- Manual qualitative testing
- Real-world usage in API and browser-extension workflows

### Observed Strengths

- High accuracy on short queries and statements
- Strong differentiation between questioning and informational tone
- Stable confidence distributions
- Low-latency inference on CPU

### Metrics

- **Accuracy:** ~92% on the validation set
- **F1 score (macro):** ~0.90

---

## Environmental Impact

The model was trained using standard GPU infrastructure. Exact carbon emissions were not formally measured.

- **Hardware:** GPU (cloud-based)
- **Cloud provider:** Not disclosed
- **Compute region:** Not disclosed
- **Training time:** Approximately 2-4 hours

---

## Technical Specifications

### Model Architecture and Objective

- Transformer-based encoder (DistilBERT)
- Multi-class classification objective
- Softmax probability distribution over tone labels
- 8 output classes

### Compute Infrastructure

#### Hardware

- GPU for training
- CPU-friendly inference

#### Software

- Python 3.8+
- PyTorch 2.0+
- Hugging Face Transformers 4.30+

---

## Citation

If you use this model in your work, attribution is appreciated.
**BibTeX:**

```bibtex
@misc{tonebaselinev3,
  author       = {Lokesh P},
  title        = {Tone Baseline v3},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LokeshDevCreates/tone-baseline-v3}}
}
```

---

## Model Card Authors

- Lokesh P

---

## Model Card Contact

For questions or issues, please open an issue on the Hugging Face model repository or contact the author via the Hugging Face platform.

---

## Acknowledgments

This model was developed to support tone-aware text processing in production applications. Thanks to the Hugging Face community for providing excellent tools and infrastructure.