---
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
tags:
- text-classification
- multi-label-classification
- personality
- bert
- pytorch
- transformers
- turkish
- classification
- human-resources
- custom-trained
license: apache-2.0
---

# bert_turkish_personality_analysis

This repository hosts a **Turkish BERT model fine-tuned for multi-label personality trait classification**.
Built on top of `dbmdz/bert-base-turkish-cased`, this model predicts psychological and professional personality traits from Turkish text input.


## 🎯 Task: Multi-label Personality Trait Detection

Given a CV, personal statement, or written expression, the model assigns **zero or more traits** from the following set:

### 🏷️ Supported Labels

* `özgüvenli` – confident
* `içe kapanık` – introverted
* `lider` – leader
* `takım oyuncusu` – team player
* `kararsız` – indecisive
* `abartılı` – exaggerated
* `profesyonel` – professional
* `deneyimli` – experienced

The model performs **multi-label classification**: a sigmoid is applied to each logit independently, and every label whose probability exceeds a threshold is assigned.


## 🔧 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "MUR55/bert_turkish_personality_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Sample text
text = "5 yıllık yöneticilik tecrübemle liderlik becerilerimi geliştirdim, aynı zamanda ekip çalışmalarına önem veririm."

# Tokenize and predict (no gradients needed at inference time)
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)

# Threshold each probability independently; the label order must match
# the model's training-time label mapping
threshold = 0.5
labels = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu", "kararsız", "abartılı", "profesyonel", "deneyimli"]
predicted = [label for label, prob in zip(labels, probs[0]) if prob >= threshold]

print("Predicted traits:", predicted)
```


## 🧠 Model Details

* **Base model:** [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased)
* **Architecture:** BERT with a linear classification head
* **Task type:** Multi-label classification
* **Loss Function:** Binary Cross Entropy with Logits
* **Training Data:** Custom Turkish dataset with personality trait annotations (e.g., CVs, social texts)
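As a sketch of the loss above: binary cross-entropy with logits scores each label's logit independently against a multi-hot target and averages the per-label losses, mirroring PyTorch's `BCEWithLogitsLoss`. The logits and targets below are illustrative, not from the model:

```python
import math

def bce_with_logits(logits, targets):
    """Numerically stable binary cross-entropy with logits,
    averaged over all labels (as in torch.nn.BCEWithLogitsLoss)."""
    total = 0.0
    for x, y in zip(logits, targets):
        # max(x, 0) - x*y + log(1 + exp(-|x|)) is the standard stable form
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

# One example with 8 trait logits and a multi-hot target vector
logits = [2.0, -1.5, 0.3, 1.1, -2.2, -0.7, 1.8, 0.9]
targets = [1, 0, 0, 1, 0, 0, 1, 1]
print(round(bce_with_logits(logits, targets), 4))
```

Because each label gets its own independent sigmoid and loss term, the model can assign any subset of the eight traits rather than being forced to pick exactly one.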


## 📈 Performance

The model was evaluated on a held-out portion of the dataset. The values below are placeholders; replace them with your measured metrics:

| Metric    | Value |
| --------- | ----- |
| Accuracy  | 0.92  |
| F1-Score  | 0.94  |
| Precision | 0.91  |
| Recall    | 0.96  |
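For multi-label outputs, one common way to compute such figures is micro-averaging: true positives, false positives, and false negatives are pooled across every (example, label) pair before precision, recall, and F1 are derived. A minimal sketch, assuming multi-hot prediction and target vectors (the sample vectors are illustrative):

```python
def micro_f1(preds, targets):
    """Micro-averaged F1 over multi-hot label vectors:
    pool TP/FP/FN across all (example, label) pairs."""
    tp = fp = fn = 0
    for p_vec, t_vec in zip(preds, targets):
        for p, t in zip(p_vec, t_vec):
            tp += (p == 1 and t == 1)
            fp += (p == 1 and t == 0)
            fn += (p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Two examples over the 8 traits (multi-hot vectors)
preds   = [[1, 0, 0, 1, 0, 0, 1, 1], [0, 1, 0, 0, 1, 0, 0, 0]]
targets = [[1, 0, 0, 1, 0, 0, 0, 1], [0, 1, 0, 0, 0, 0, 0, 0]]
print(round(micro_f1(preds, targets), 3))
```

Macro-averaging (computing F1 per label, then averaging) is an alternative that weights rare traits equally; which averaging the table uses should be stated alongside the numbers.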


## 🔍 Applications

* CV analysis and candidate profiling
* Smart recruiting and HR systems
* Social media or forum persona evaluation
* Turkish personality-aware recommendation systems


## 📁 Files Included

* `pytorch_model.bin` – fine-tuned model weights
* `config.json` – model configuration
* `tokenizer_config.json`, `vocab.txt` – tokenizer files


## 🤝 Acknowledgments

This project builds upon [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased). Thanks to the Turkish NLP community for contributions and datasets.


## 📬 Contact

If you have questions or suggestions, feel free to open an issue on the [model page](https://huggingface.co/MUR55/bert_turkish_personality_analysis) or contact the author.