---
license: mit
---
# Sentiment Analysis Model (Vibescribe)

Vibescribe is a sentiment analysis model built with Hugging Face Transformers and fine-tuned on IMDB movie reviews.

## Setup

1. Clone the repository:
```bash
git clone https://github.com/your-username/sentiment-analysis
cd sentiment-analysis
```

2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

4. Log in to Hugging Face:
```bash
huggingface-cli login
```

## Project Structure
```
sentiment-analysis/
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ train.py
β”œβ”€β”€ inference.py
β”œβ”€β”€ utils.py
└── README.md
```

## Files to Create

### requirements.txt
```
transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
```

### utils.py
```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    return {
        'accuracy': accuracy_score(labels, preds),
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
```
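The `Trainer` calls `compute_metrics` with an object exposing `label_ids` and `predictions`. A quick way to sanity-check the function without a full training run is to feed it a mock with those two attributes (the `SimpleNamespace` stand-in below is illustrative, not part of the project):

```python
from types import SimpleNamespace

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    return {
        'accuracy': accuracy_score(labels, preds),
        'f1': f1,
        'precision': precision,
        'recall': recall
    }

# Mock prediction object: logits for 4 examples, 3 of them classified correctly.
mock = SimpleNamespace(
    label_ids=np.array([0, 1, 1, 0]),
    predictions=np.array([[2.0, 0.1], [0.3, 1.5], [1.2, 0.4], [1.8, 0.2]]),
)
metrics = compute_metrics(mock)
print(metrics)  # accuracy 0.75, precision 1.0, recall 0.5
```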

### inference.py
```python
from transformers import pipeline

def load_model(model_path):
    return pipeline("sentiment-analysis", model=model_path)

def predict(classifier, text):
    return classifier(text)

if __name__ == "__main__":
    model_path = "your-username/sentiment-analysis-model"
    classifier = load_model(model_path)
    
    # Example prediction
    text = "This movie was really great!"
    result = predict(classifier, text)
    print(f"Text: {text}\nSentiment: {result}")
```

## Training

1. Update model configuration in `train.py`:
```python
training_args = TrainingArguments(
    output_dir="sentiment-analysis-model",
    hub_model_id="your-username/sentiment-analysis-model",  # Change this
    ...
)
```

2. Start training:
```bash
python train.py
```

## Making Predictions

```python
from inference import load_model, predict

classifier = load_model("your-username/sentiment-analysis-model")
result = predict(classifier, "Your text here")
```

## Model Details

- Base model: DistilBERT
- Dataset: IMDB Reviews
- Task: Binary sentiment classification (positive/negative)
- Training time: ~2-3 hours on GPU
- Model size: ~260MB

## Performance Metrics

- Accuracy: ~91-93%
- F1 Score: ~91-92%
- Precision: ~90-91%
- Recall: ~91-92%

## Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Open a pull request

## License

MIT License