# 🔬 Sentiment-Analysis-for-Product-Release-Sentiment

A **BERT-based sentiment analysis model** fine-tuned on a product review dataset. It predicts the sentiment of a text as **Positive**, **Neutral**, or **Negative** with a confidence score. This model is ideal for analyzing customer feedback, reviews, or user comments.
|
|
---

## ✨ Model Highlights

- 🚀 **Architecture**: Based on [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) by Google
- 🧠 **Fine-tuned** on labeled product review data
- 📊 **3-way sentiment classification**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- 💾 **Quantized version available** for faster inference
|
|
---

## 🧠 Intended Uses

- ✅ Classifying product feedback and user reviews
- ✅ Sentiment analysis for e-commerce platforms
- ✅ Social media monitoring and customer opinion mining
|
|
---

## 🚫 Limitations

- ❌ Designed for English texts only
- ❌ May not perform well on sarcastic or ironic inputs
- ❌ May struggle with domains very different from product reviews (e.g., medical or legal text)
- ❌ Input texts longer than 128 tokens are truncated
- ❌ The quantized version may show a slight drop in accuracy compared to the full-precision model
|
|
---

## 🏋️‍♀️ Training Details

- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom-labeled product review dataset
- **Epochs**: 5
- **Batch Size**: 8
- **Max Length**: 128 tokens
- **Optimizer**: AdamW
- **Loss Function**: CrossEntropyLoss (with class balancing)
- **Hardware**: Trained on an NVIDIA GPU (CUDA-enabled)
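
The exact training script is not part of this repository; the following is a minimal sketch of what the setup above might look like. The example texts, the inverse-frequency class weighting, and the learning rate of `2e-5` are illustrative assumptions, not values taken from the actual training run.

```python
import torch
from collections import Counter
from torch.utils.data import DataLoader
from transformers import BertTokenizer, BertForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3
).to(device)

# Placeholder examples standing in for the custom-labeled review dataset
train_data = [
    ("Terrible battery life.", 0),         # Negative
    ("It's okay, nothing special.", 1),    # Neutral
    ("Absolutely love this product!", 2),  # Positive
]

# Inverse-frequency class weights for the "class balancing" mentioned above
counts = Counter(label for _, label in train_data)
weights = torch.tensor(
    [len(train_data) / (3 * counts[i]) for i in range(3)],
    dtype=torch.float, device=device,
)
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed lr

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), truncation=True, max_length=128,
                    padding=True, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_data, batch_size=8, shuffle=True, collate_fn=collate)

model.train()
for epoch in range(5):
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        labels = batch.pop("labels")
        loss = loss_fn(model(**batch).logits, labels)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```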
|
|
---

## 📊 Evaluation Metrics
|
|
| | Metric | Score | |
| |------------|-------| |
| | Accuracy | 0.90 | |
| | F1 | 0.90 | |
| | Precision | 0.90 | |
| | Recall | 0.90 | |
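
The card does not state how precision, recall, and F1 were averaged across the three classes. A sketch of one plausible evaluation, assuming weighted averaging with scikit-learn (`y_true` and `y_pred` are placeholders, not real evaluation data):

```python
# Illustrative evaluation sketch; "weighted" averaging is an assumption.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 2, 1, 2, 0]  # gold labels from a held-out test split (placeholder)
y_pred = [0, 2, 1, 1, 0]  # model predictions (placeholder)

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"Accuracy={accuracy:.2f}  Precision={precision:.2f}  "
      f"Recall={recall:.2f}  F1={f1:.2f}")
```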
|
|
---

## 🔖 Label Mapping
|
|
| | Label ID | Sentiment | |
| |----------|-----------| |
| | 0 | Negative | |
| | 1 | Neutral | |
| | 2 | Positive | |
|
|
---

## 🚀 Usage Example
|
|
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
import torch.nn.functional as F

# Load model and tokenizer
model_name = "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# Inference
def predict_sentiment(text):
    # Tokenize; inputs longer than 128 tokens are truncated
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       max_length=128, padding=True)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=1)

    predicted_class_id = torch.argmax(probs, dim=1).item()
    confidence = probs[0][predicted_class_id].item()

    label_map = {0: "Negative", 1: "Neutral", 2: "Positive"}
    label = label_map[predicted_class_id]
    confidence_str = f"confidence: {confidence * 100:.1f}%"

    return label, confidence_str

# Example
print(predict_sentiment(
    "The service was excellent and the staff was friendly.")
)
```
|
|
---

## 🧪 Quantization

- Applied **post-training dynamic quantization** using PyTorch to reduce model size and speed up inference.
- The quantized model supports CPU-based deployments.
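
The quantized weights ship with the repository, but for reference, a minimal sketch of how PyTorch post-training dynamic quantization can be applied to the full-precision model (int8 `Linear` layers, CPU inference only); this is not necessarily the exact export script used:

```python
# Quantizes the Linear layers to int8 at load time; activations stay float.
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
)
model.eval()

quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```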
|
|
---

## 📁 Repository Structure

```
.
├── model/               # Quantized model files
├── tokenizer/           # Tokenizer config and vocabulary
├── model.safetensors    # Fine-tuned full-precision model weights
└── README.md            # Model documentation
```
|
|
---
|
|
## 🤝 Contributing

We welcome contributions! Please feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.