---
license: apache-2.0
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
---
# Turkish BERT for Aspect-Based Sentiment Analysis

This model fine-tunes [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) for aspect-based sentiment analysis (ABSA) on Turkish e-commerce product reviews. Given a review and an aspect term, it predicts the sentiment expressed toward that aspect.

## Model Description

- **Base Model**: dbmdz/bert-base-turkish-cased
- **Task**: Sequence Classification (Aspect-Based Sentiment Analysis)
- **Language**: Turkish
- **Domain**: E-commerce product reviews

## Model Performance

- **F1 Score**: 88% on test set
- **Test Set Size**: 4,000 samples
- **Training Set Size**: 36,000 samples
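
The card reports a single F1 figure without stating the averaging method. As an illustration only, the sketch below computes per-class and macro-averaged F1 in plain Python. The helper name `macro_f1` and the toy labels are hypothetical; this is not the author's evaluation code or data.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]

def macro_f1(y_true, y_pred, labels=LABELS):
    """Return (per-class F1 dict, macro-averaged F1)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # the predicted class gains a false positive
            fn[gold] += 1  # the gold class gains a false negative
    per_class = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class[label] = 2 * tp[label] / denom if denom else 0.0
    return per_class, sum(per_class.values()) / len(labels)

# Toy data, for illustration only
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
per_class, macro = macro_f1(y_true, y_pred)
print(per_class, round(macro, 3))
```

Macro averaging weights the three classes equally, so a weak minority class (often `neutral` in review data) pulls the score down more than it would under micro averaging.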

## Training Details

### Training Data
- **Dataset Size**: 36,000 reviews
- **Data Source**: Private e-commerce product review dataset
- **Domain**: E-commerce product reviews in Turkish
- **Coverage**: Over 500 product categories

### Training Configuration
- **Epochs**: 5
- **Task Type**: Sequence Classification
- **Input Format**: `[aspect_term] [SEP] [review_text]`
- **Label Classes**: 
  - `positive`: Positive sentiment towards the aspect
  - `negative`: Negative sentiment towards the aspect
  - `neutral`: Neutral sentiment towards the aspect
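
The input format above corresponds to BERT's sentence-pair encoding, with the aspect as the first segment and the review as the second. The sketch below only mimics that layout on whitespace-split words to make the segment structure visible. `build_pair` is a hypothetical helper, and the real tokenizer produces WordPiece sub-tokens rather than whole words.

```python
def build_pair(aspect, review):
    """Lay out an aspect-review pair the way a BERT pair encoding does.

    Returns (tokens, segment_ids): segment 0 covers [CLS], the aspect and
    the first [SEP]; segment 1 covers the review and the final [SEP].
    """
    first = ["[CLS]"] + aspect.split() + ["[SEP]"]
    second = review.split() + ["[SEP]"]
    return first + second, [0] * len(first) + [1] * len(second)

tokens, segment_ids = build_pair("arka kamerası", "arka kamerası çok iyi")
for token, segment in zip(tokens, segment_ids):
    print(segment, token)
```

Passing the aspect and review as two separate arguments to the tokenizer, as Options 2 and 3 below do, lets the tokenizer insert the special tokens and segment ids itself.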

### Training Loss
Training loss decreased at every epoch:

| Epoch | Loss |
|-------|------|
| 1     | 0.47 |
| 2     | 0.34 |
| 3     | 0.25 |
| 4     | 0.22 |
| 5     | 0.11 |

## Usage

### Option 1: Using Pipeline (Recommended)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Create pipeline
sentiment_analyzer = pipeline("text-classification", 
                             model=model, 
                             tokenizer=tokenizer)

# Example usage
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
text = f"{aspect} [SEP] {review}"
result = sentiment_analyzer(text)
print(result)
```

**Expected Output:**
```python
[{'label': 'positive', 'score': 0.9998155236244202}]
```

### Option 2: Manual Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect and review
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."

# Tokenize aspect and review together
inputs = tokenizer(aspect, review, return_tensors="pt", truncation=True, padding=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_id = predictions.argmax(dim=-1).item()
    confidence = predictions.max().item()

# Convert prediction to label
predicted_label = model.config.id2label[predicted_class_id]
print(f"Aspect: {aspect}")
print(f"Sentiment: {predicted_label}")
print(f"Confidence: {confidence:.4f}")
```

**Expected Output:**
```
Aspect: arka kamerası
Sentiment: positive
Confidence: 0.9998
```

### Option 3: Batch Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect-review pairs
examples = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("fiyatı", "Ürünün fiyatı çok uygun ve kalitesi de iyi."),
]

aspects = [ex[0] for ex in examples]
reviews = [ex[1] for ex in examples]

# Tokenize all pairs
inputs = tokenizer(aspects, reviews, return_tensors="pt", truncation=True, padding=True)

# Get predictions for all pairs
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_ids = predictions.argmax(dim=-1)
    confidences = predictions.max(dim=-1).values

# Display results
for i, (aspect, review) in enumerate(examples):
    predicted_label = model.config.id2label[predicted_class_ids[i].item()]
    confidence = confidences[i].item()
    print(f"Aspect: {aspect}")
    print(f"Sentiment: {predicted_label} (confidence: {confidence:.4f})")
    print("-" * 40)
```

**Expected Output:**
```
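Aspect: arka kamerası
Sentiment: positive (confidence: 0.9998)
----------------------------------------
Aspect: bataryası
Sentiment: negative (confidence: 0.9990)
----------------------------------------
Aspect: fiyatı
Sentiment: positive (confidence: 0.9998)
----------------------------------------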
```
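
Option 3 tokenizes every pair in a single call, which can exhaust memory for large review sets. One common workaround is to feed the model fixed-size chunks. The sketch below shows only the chunking step; `chunked` and `batch_size` are hypothetical names, not part of the model's code.

```python
def chunked(pairs, batch_size):
    """Yield successive lists of at most batch_size (aspect, review) pairs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]

# Placeholder pairs, for illustration only
pairs = [(f"aspect {i}", f"review {i}") for i in range(5)]
batches = list(chunked(pairs, batch_size=2))
print([len(batch) for batch in batches])  # [2, 2, 1]
```

Each chunk can then be split into aspect and review lists and passed through the tokenizer and model exactly as in Option 3.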

## Combined Usage with Aspect Extraction (Recommended)

This model is designed to pair with the aspect extraction model [opdullah/bert-turkish-ecomm-aspect-extraction](https://huggingface.co/opdullah/bert-turkish-ecomm-aspect-extraction). The extractor finds the aspect terms in a review, and this model then classifies the sentiment toward each one:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import torch

# Load aspect extraction model
aspect_extractor = pipeline("token-classification", 
                           model="opdullah/bert-turkish-ecomm-aspect-extraction", 
                           aggregation_strategy="simple")

# Load sentiment analysis model
sentiment_tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
sentiment_model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

def analyze_aspect_sentiment(review):
    # Extract aspects
    aspects = aspect_extractor(review)
    
    results = []
    for aspect in aspects:
        if aspect['entity_group'] == 'ASPECT':
            aspect_text = aspect['word']
            
            # Analyze sentiment
            inputs = sentiment_tokenizer(aspect_text, review, return_tensors="pt", truncation=True)
            with torch.no_grad():
                outputs = sentiment_model(**inputs)
                prediction = outputs.logits.argmax().item()
                sentiment = sentiment_model.config.id2label[prediction]
            
            results.append({'aspect': aspect_text, 'sentiment': sentiment})
    
    return results

# Usage
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
results = analyze_aspect_sentiment(review)

for result in results:
    print(f"{result['aspect']}: {result['sentiment']}")
```

**Expected Output:**
```
arka kamerası: positive
bataryası: negative
```

## Label Mapping

```python
id2label = {
    0: "negative",
    1: "neutral", 
    2: "positive"
}

label2id = {
    "negative": 0,
    "neutral": 1,
    "positive": 2
}
```
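
To show how this mapping is applied, the sketch below turns a vector of raw logits into a label and a confidence with a plain-Python softmax, mirroring what Options 2 and 3 do with `torch`. The logit values are made up for illustration.

```python
import math

id2label = {0: "negative", 1: "neutral", 2: "positive"}

def decode(logits):
    """Softmax the logits, then return (label, probability) of the top class."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

label, confidence = decode([-2.0, 0.5, 3.0])  # hypothetical logits
print(label, round(confidence, 4))
```

In practice, read the mapping from `model.config.id2label` rather than hard-coding it, so the code stays correct if the checkpoint's label order ever differs.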

## Intended Use

This model is designed for:
- Analyzing sentiment of specific aspects in Turkish e-commerce product reviews
- Building complete aspect-based sentiment analysis systems
- Understanding customer opinions on specific product features
- Supporting recommendation systems and review analysis tools
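
For the review-analysis use case, per-aspect predictions can be rolled up into counts. A minimal sketch, assuming results shaped like the dictionaries returned by `analyze_aspect_sentiment` in the combined-usage example; `summarize` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def summarize(results):
    """Count sentiment labels per aspect across many predictions."""
    summary = defaultdict(Counter)
    for item in results:
        summary[item["aspect"]][item["sentiment"]] += 1
    return {aspect: dict(counts) for aspect, counts in summary.items()}

# Hypothetical predictions collected from several reviews
results = [
    {"aspect": "bataryası", "sentiment": "negative"},
    {"aspect": "bataryası", "sentiment": "negative"},
    {"aspect": "bataryası", "sentiment": "positive"},
    {"aspect": "fiyatı", "sentiment": "positive"},
]
print(summarize(results))
```

This groups by the exact aspect string, so inflected variants of the same word are counted separately unless they are normalized first.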

## Limitations

- Trained only on e-commerce product reviews, so performance may drop on other domains or text types
- Requires aspect terms to be identified beforehand, for example with the aspect extraction model listed under Related Models
- Limited to Turkish
- Trained on a private dataset, so the reported results cannot be independently reproduced

## Citation

If you use this model, please cite:

```
@misc{turkish-bert-absa,
  title={Turkish BERT for Aspect-Based Sentiment Analysis},
  author={Abdullah Koçak},
  year={2025},
  url={https://huggingface.co/opdullah/bert-turkish-ecomm-absa}
}
```

## Base Model Citation

```
@misc{schweter2020bertbase,
  title={BERTurk - BERT models for Turkish},
  author={Stefan Schweter},
  year={2020},
  url={https://huggingface.co/dbmdz/bert-base-turkish-cased}
}
```

## Related Models

- [opdullah/bert-turkish-ecomm-aspect-extraction](https://huggingface.co/opdullah/bert-turkish-ecomm-aspect-extraction) - For extracting aspect terms from Turkish e-commerce reviews