---
license: mit
---
# Sentiment Analysis Model (Vibescribe)
Vibescribe is a DistilBERT-based sentiment analysis model built with Hugging Face Transformers and fine-tuned on IMDB reviews.
## Setup
1. Clone the repository:
```bash
git clone https://github.com/your-username/sentiment-analysis
cd sentiment-analysis
```
2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Log in to Hugging Face:
```bash
huggingface-cli login
```
## Project Structure
```
sentiment-analysis/
├── requirements.txt
├── train.py
├── inference.py
├── utils.py
└── README.md
```
## Files to Create
### requirements.txt
```
transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
```
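After installing, it can help to confirm that the environment actually matches these pins. The script below is a minimal sketch using only the standard library; the helper names `parse_pins` and `check_pins` are hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from requirements.txt above.
PINS = """\
transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins):
    """Return {name: (pinned, installed)} for every package that does not match."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Ignore local build suffixes such as "+cu121" when comparing.
        if installed is None or installed.split("+")[0] != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, installed) in check_pins(parse_pins(PINS)).items():
        print(f"{name}: expected {pinned}, found {installed or 'not installed'}")
```

If the script prints nothing, every pinned package is installed at the expected version.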
### utils.py
```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compute_metrics(pred):
    """Compute evaluation metrics from a Trainer prediction object."""
    labels = pred.label_ids
    # Pick the highest-scoring class for each example.
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average='binary'
    )
    return {
        'accuracy': accuracy_score(labels, preds),
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
```
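To make the metrics concrete, here is a small worked example in plain Python that mirrors what `compute_metrics` calculates with `average='binary'`, treating label 1 as the positive class. The function `binary_metrics` and the toy logits are illustrative assumptions, not part of `utils.py`.

```python
def binary_metrics(labels, preds):
    """Accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    correct = sum(1 for y, p in pairs if y == p)

    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(labels),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }


# Toy logits for six reviews: one [negative, positive] score pair per review.
logits = [[0.1, 2.0], [1.5, -0.3], [0.9, 0.2], [-1.0, 1.2], [0.0, 0.4], [2.2, 0.1]]
labels = [1, 0, 1, 1, 0, 0]

# Equivalent of predictions.argmax(-1): index of the largest score in each row.
preds = [max(range(len(row)), key=row.__getitem__) for row in logits]

print(binary_metrics(labels, preds))
```

In this toy data there are 2 true positives, 1 false positive and 1 false negative, so precision, recall and F1 all come out to 2/3, while accuracy is 4/6.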
### inference.py
```python
from transformers import pipeline


def load_model(model_path):
    """Load a sentiment-analysis pipeline from a local path or Hub model ID."""
    return pipeline("sentiment-analysis", model=model_path)


def predict(classifier, text):
    """Run the classifier on a string (or a list of strings)."""
    return classifier(text)


if __name__ == "__main__":
    model_path = "your-username/sentiment-analysis-model"
    classifier = load_model(model_path)

    # Example prediction
    text = "This movie was really great!"
    result = predict(classifier, text)
    print(f"Text: {text}\nSentiment: {result}")
```
## Training
1. Update model configuration in `train.py`:
```python
training_args = TrainingArguments(
    output_dir="sentiment-analysis-model",
    hub_model_id="your-username/sentiment-analysis-model",  # Change this
    ...
)
```
2. Start training:
```bash
python train.py
```
## Making Predictions
```python
from inference import load_model, predict
classifier = load_model("your-username/sentiment-analysis-model")
result = predict(classifier, "Your text here")
```
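The pipeline returns a list of dictionaries, each with a `label` and a `score`. If the model config does not define label names, the labels appear as `LABEL_0` and `LABEL_1`. The sketch below maps them to readable names, assuming the IMDB convention of 0 = negative and 1 = positive; `format_result` and `LABEL_NAMES` are hypothetical helpers, and the sample score is made up for illustration.

```python
# Assumed mapping: IMDB convention, 0 = negative and 1 = positive.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def format_result(result):
    """Convert pipeline output into a list of (sentiment, score) tuples."""
    formatted = []
    for item in result:
        # Fall back to the lowercased raw label if it is not in the mapping.
        label = LABEL_NAMES.get(item["label"], item["label"].lower())
        formatted.append((label, item["score"]))
    return formatted


# Shaped like the output of predict(classifier, "Your text here").
sample = [{"label": "LABEL_1", "score": 0.98}]
print(format_result(sample))
```

If your trained model already reports named labels, the fallback branch passes them through in lowercase.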
## Model Details
- Base model: DistilBERT
- Dataset: IMDB Reviews
- Task: Binary sentiment classification (positive/negative)
- Training time: ~2-3 hours on GPU
- Model size: ~260MB
## Performance Metrics
- Accuracy: ~91-93%
- F1 Score: ~91-92%
- Precision: ~90-91%
- Recall: ~91-92%
## Contributing
1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Open a pull request
## License
MIT License