---
library_name: transformers
tags:
- emotion-classification
- text-classification
- roberta
- goemotions
- sentiment-analysis
license: mit
datasets:
- google-research-datasets/go_emotions
language:
- en
metrics:
- accuracy
- f1
base_model: roberta-base
---

# RoBERTa Emotion Classifier (7-class)

A RoBERTa model fine-tuned for 7-class emotion classification: **happy, sad, angry, fear, disgust, surprise, neutral**.

## Model Details

- **Developed by:** VanshajR
- **Base Model:** `roberta-base` (125M parameters)
- **Task:** Multi-class emotion classification
- **Dataset:** GoEmotions (27 emotions mapped to 7)
- **Training Samples:** ~58,000
- **Language:** English
- **License:** MIT

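The 7-way head corresponds to a standard sequence-classification configuration on top of `roberta-base`. A minimal sketch of how such a model is typically initialized; the label order shown is an assumption chosen to match the usage example below, and the authoritative mapping is the `id2label` entry in this model's config:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed label order; check config.json of the published model for the real mapping.
labels = ["happy", "sad", "angry", "fear", "disgust", "surprise", "neutral"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
# Initializes a fresh 7-way classification head on top of roberta-base.
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=len(labels),
    id2label={i: label for i, label in enumerate(labels)},
    label2id={label: i for i, label in enumerate(labels)},
)
```
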
## Performance

Evaluated on the GoEmotions test set:

| Metric | Score |
|--------|-------|
| **Accuracy** | **57.77%** |
| **Macro F1** | **0.4787** |
| Macro Precision | 0.5289 |
| Macro Recall | 0.4958 |

### Per-Class Performance

| Emotion | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| Happy | 0.62 | 0.67 | 0.64 | 2,362 |
| Sad | 0.54 | 0.51 | 0.52 | 1,210 |
| Angry | 0.58 | 0.43 | 0.49 | 1,145 |
| Fear | 0.42 | 0.31 | 0.36 | 428 |
| Disgust | 0.48 | 0.26 | 0.34 | 361 |
| Surprise | 0.43 | 0.43 | 0.43 | 623 |
| Neutral | 0.64 | 0.86 | 0.73 | 8,711 |

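Per-class figures like these are typically produced with `sklearn.metrics.classification_report`; a minimal sketch, where `y_true` and `y_pred` are placeholders for the test-split labels and model predictions (the values shown are illustrative only):

```python
from sklearn.metrics import accuracy_score, classification_report

labels = ["happy", "sad", "angry", "fear", "disgust", "surprise", "neutral"]

# Illustrative only: replace with real test-set labels and predicted class IDs.
y_true = [0, 1, 6, 6, 2, 3, 6, 0]
y_pred = [0, 1, 6, 2, 2, 3, 6, 6]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.4f}")
print(classification_report(
    y_true, y_pred,
    labels=list(range(len(labels))),
    target_names=labels,
    digits=2,
    zero_division=0,
))
```
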
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("VanshajR/roberta-emotion-7class")
model = AutoModelForSequenceClassification.from_pretrained("VanshajR/roberta-emotion-7class")

# Classify emotion
text = "I'm so excited about this project!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1).item()

# Emotion labels
emotions = ["happy", "sad", "angry", "fear", "disgust", "surprise", "neutral"]
print(f"Predicted emotion: {emotions[predicted_class]}")
print(f"Confidence: {predictions[0][predicted_class].item():.2%}")
```

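For quick experiments, the same checkpoint can also be run through the `pipeline` API; a minimal sketch (the returned label strings come from the `id2label` mapping stored in the model config):

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="VanshajR/roberta-emotion-7class",
    top_k=None,  # return scores for all 7 classes instead of only the top one
)

print(classifier("I'm so excited about this project!"))
```
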
## Training Details

### Training Data

- **Dataset:** GoEmotions (Google Research)
- **Emotion Mapping:** 27 fine-grained emotions → 7 basic emotions (see the sketch below)
- **Training Samples:** ~58,000 Reddit comments
- **Preprocessing:** Truncation to 128 tokens, lowercase normalization

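The exact 27→7 grouping is not listed here, so the mapping below is an illustrative Ekman-style reduction of the GoEmotions categories (plus `neutral`), not necessarily the one used to build the training data:

```python
# Hypothetical 27-to-7 grouping for illustration; the actual mapping may differ.
GOEMOTIONS_TO_7 = {
    "happy": ["admiration", "amusement", "approval", "caring", "desire",
              "excitement", "gratitude", "joy", "love", "optimism", "pride",
              "relief"],
    "sad": ["disappointment", "grief", "remorse", "sadness"],
    "angry": ["anger", "annoyance", "disapproval"],
    "fear": ["fear", "nervousness"],
    "disgust": ["disgust", "embarrassment"],
    "surprise": ["confusion", "curiosity", "realization", "surprise"],
    "neutral": ["neutral"],
}

# Invert to a flat lookup: fine-grained label -> 7-class label
LABEL_MAP = {fine: coarse for coarse, fines in GOEMOTIONS_TO_7.items() for fine in fines}
```
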
### Training Procedure

- **Optimizer:** AdamW (lr=2e-5, weight_decay=0.01)
- **Batch Size:** 16 (train), 32 (eval)
- **Epochs:** 3
- **Max Length:** 128 tokens
- **Training Regime:** fp32

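These hyperparameters map roughly onto the following `TrainingArguments`; a sketch assuming the standard `Trainer` API, with an illustrative output directory and dataset preparation omitted:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above. "roberta-emotion-7class" is an
# illustrative output directory, not necessarily the one used for this model.
training_args = TrainingArguments(
    output_dir="roberta-emotion-7class",
    learning_rate=2e-5,           # AdamW is the Trainer default optimizer
    weight_decay=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    fp16=False,                   # fp32 training regime
)
```
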
### Compute Infrastructure

- **Hardware:** NVIDIA RTX 3070 (8 GB VRAM)
- **Training Time:** ~2 hours
- **Framework:** PyTorch 2.1.0, Transformers 4.35.0

## Limitations and Bias

- **Language:** English only
- **Domain:** Trained primarily on Reddit comments; may not generalize to formal text
- **Class Imbalance:** Performance is noticeably better on frequent classes (happy, neutral) than on rare ones (fear, disgust)
- **Subjective Task:** Human annotators often disagree on emotion labels (~25-30% disagreement rate)

## Intended Use

✅ **Recommended:**

- Emotion detection in conversational text
- Evaluating emotion-controlled text generation
- Research on emotion understanding in dialogue
- Sentiment analysis applications

❌ **Not Recommended:**

- Clinical diagnosis or mental health assessment
- High-stakes decision making
- Non-English languages

## Citation

```bibtex
@misc{vanshajr2024roberta,
  author    = {Vanshaj R},
  title     = {RoBERTa Emotion Classifier for 7-Class Emotion Detection},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/VanshajR/roberta-emotion-7class}
}
```

## Related Work

Part of the **Emotion-Controlled Response Generation** project:

- [GitHub Repository](https://github.com/VanshajR/emotion-controlled-generation)
- [GPT-2 Emotion-Conditioned Model](https://huggingface.co/VanshajR/gpt2-emotion-prefix)
- [Full Project Report](https://github.com/VanshajR/emotion-controlled-generation/blob/main/PROJECT_REPORT.md)