	# FinBERT Sentiment Analysis on English/Quotes Dataset

	## 📌 Overview

	This repository hosts a FinBERT model fine-tuned for sentiment analysis on the English/Quotes dataset. The model classifies text into one of three sentiment categories: negative, neutral, or positive.

	## 🏗 Model Details

	- Model Architecture: FinBERT (BERT-based model for sentiment analysis)
	- Task: Sentiment Analysis
	- Dataset: English/Quotes dataset
	- Fine-tuning Framework: Hugging Face Transformers

	## 🚀 Usage

	### Installation

	```bash
	pip install transformers torch
	```

	### Loading the Model

	```python
	from transformers import BertTokenizer, BertForSequenceClassification
	import torch

	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Hugging Face Hub ID of this repository (a valid ID has the form "owner/name")
	model_name = "AventIQ-AI/finbert-sentiment-analysis"
	model = BertForSequenceClassification.from_pretrained(model_name).to(device)
	tokenizer = BertTokenizer.from_pretrained(model_name)
	```

	### Sentiment Classification Inference

	```python
	def predict_sentiment(text):
	    """Return the predicted sentiment label for a single string."""
	    inputs = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="pt")
	    inputs = {key: val.to(device) for key, val in inputs.items()}  # Move inputs to device
	    with torch.no_grad():  # No gradients are needed for inference
	        outputs = model(**inputs)
	    logits = outputs.logits
	    prediction = torch.argmax(logits, dim=-1).item()
	    label_map = {0: "negative", 1: "neutral", 2: "positive"}
	    return label_map[prediction]

	# Test on the original 5 quotes
	original_quotes = [
	    "“Be yourself; everyone else is already taken.”",
	    "“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”",
	    "“Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.”",
	    "“So many books, so little time.”",
	    "“A room without books is like a body without a soul.”",
	]

	print("Predictions for original quotes:")
	for quote in original_quotes:
	    pred = predict_sentiment(quote)
	    print(f"Quote: {quote}\nPredicted Sentiment: {pred}\n")

	# Test on a new example
	new_quote = "Life is beautiful when you smile."
	print("Prediction for new quote:")
	print(f"Quote: {new_quote}\nPredicted Sentiment: {predict_sentiment(new_quote)}")
	```
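
`predict_sentiment` returns only the top label. When a confidence score is useful as well, the logits can be passed through a softmax. The sketch below does this in plain Python on a hypothetical logit vector, so it runs without downloading the model. `logits_to_scores` is an illustrative helper, not part of this repository, and the label order is assumed to match `label_map` above.

```python
import math

# Label order copied from label_map in the inference example (assumed correct).
LABELS = ["negative", "neutral", "positive"]

def logits_to_scores(logits):
    """Convert raw logits into a {label: probability} dict via softmax."""
    # Subtract the largest logit first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}

# Hypothetical logits, standing in for outputs.logits[0].tolist().
example_logits = [-1.2, 0.3, 2.1]
scores = logits_to_scores(example_logits)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

With the real model, `torch.softmax(outputs.logits, dim=-1)` yields the same probabilities directly on the tensor.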

	## 📊 Evaluation Metrics: Accuracy & F1 Score

	For sentiment analysis, accuracy and F1-score are key evaluation metrics. The model achieves:
	- Accuracy: 88%
	- F1 Score: 0.85
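
The figures above are reported without the F1 averaging mode or the evaluation split. As a reference for computing comparable numbers, the sketch below implements accuracy and macro-averaged F1 in plain Python. The macro averaging, the `accuracy` and `macro_f1` helpers, and the toy labels are all assumptions for illustration. They are not the evaluation code behind the reported 88% and 0.85.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    f1s = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels for illustration only (0=negative, 1=neutral, 2=positive).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, labels=[0, 1, 2])
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

If the classes are imbalanced, a weighted average of the per-class scores can differ noticeably from the macro average, so the averaging mode is worth stating alongside any reported F1.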

	## 📂 Repository Structure

	```
	.
	├── model/              # Contains the fine-tuned model files
	├── tokenizer_config/   # Tokenizer configuration and vocabulary files
	├── model.safetensors   # Model weights
	└── README.md           # Model documentation
	```

	## ⚠️ Limitations

	- The model may struggle with ambiguous phrases.
	- Performance may vary across domains and vocabularies that differ from the training data.
	- The dataset primarily contains English text, so the model is less effective for multilingual applications.
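
One way to handle the ambiguous inputs mentioned above is to treat a prediction as uncertain when the two most likely classes are close together. The following is a minimal sketch under that assumption. `flag_ambiguous`, the 0.15 margin, and the example probabilities are hypothetical, and the margin would need tuning on held-out data.

```python
def flag_ambiguous(scores, min_margin=0.15):
    """Return (label, is_ambiguous) for a {label: probability} dict.

    A prediction counts as ambiguous when the gap between the two highest
    probabilities is smaller than min_margin (an arbitrary default).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_label, top_p), (_, second_p) = ranked[0], ranked[1]
    return top_label, (top_p - second_p) < min_margin

# Hypothetical class probabilities for two inputs.
clear = {"negative": 0.05, "neutral": 0.10, "positive": 0.85}
mixed = {"negative": 0.40, "neutral": 0.45, "positive": 0.15}
print(flag_ambiguous(clear))
print(flag_ambiguous(mixed))
```

Flagged inputs can then be routed to manual review or reported as "uncertain" rather than forced into one of the three labels.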

	## 🤝 Contributing

	Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.