---
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
tags:
- text-classification
- multi-label-classification
- personality
- bert
- pytorch
- transformers
- turkish
- classification
- human-resources
- custom-trained
license: apache-2.0
---

# bert\_turkish\_personality\_analysis

This repository hosts a **Turkish BERT model fine-tuned for multi-label personality trait classification**.
Built on top of `dbmdz/bert-base-turkish-cased`, this model predicts psychological and professional personality traits from Turkish text input.

## 🎯 Task: Multi-label Personality Trait Detection

Given a CV, personal statement, or other written text, the model assigns **zero or more traits** from the following set:

### 🏷️ Supported Labels

* `özgüvenli` – confident
* `içe kapanık` – introverted
* `lider` – leader
* `takım oyuncusu` – team player
* `kararsız` – indecisive
* `abartılı` – exaggerated
* `profesyonel` – professional
* `deneyimli` – experienced

The model supports **multi-label classification**: each label is scored independently with a sigmoid activation, and a label is assigned when its probability reaches a threshold.
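
To make the sigmoid-plus-threshold step concrete without loading the model, here is a minimal dependency-free sketch. The label order follows the usage example in this card; the helper names `sigmoid` and `select_labels` and the example logits are hypothetical and are not real model output.

```python
import math

# Label order as listed in this card's usage example.
LABELS = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu",
          "kararsız", "abartılı", "profesyonel", "deneyimli"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_labels(logits, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [label for label, logit in zip(LABELS, logits)
            if sigmoid(logit) >= threshold]


# Hypothetical logits chosen for illustration only.
example_logits = [2.1, -3.0, 1.4, 0.2, -1.7, -2.5, 0.9, 3.3]
print(select_labels(example_logits))

# Every label is scored on its own, so the result can also be empty.
print(select_labels([-1.0] * len(LABELS)))
```

Because the probabilities do not have to sum to one, raising or lowering `threshold` trades precision against recall for all labels at once. Per-label thresholds would be a natural extension, though this card does not describe any.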

## 🔧 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "MUR55/bert_turkish_personality_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Sample text. English: "With 5 years of management experience I developed
# my leadership skills, and I also value teamwork."
text = "5 yıllık yöneticilik tecrübemle liderlik becerilerimi geliştirdim, aynı zamanda ekip çalışmalarına önem veririm."

# Tokenize and predict (gradients are not needed for inference)
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Sigmoid scores each label independently, one probability per label
probs = torch.sigmoid(outputs.logits)

# Threshold to determine label presence
threshold = 0.5
labels = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu", "kararsız", "abartılı", "profesyonel", "deneyimli"]
predicted = [label for label, prob in zip(labels, probs[0]) if prob.item() >= threshold]

print("Predicted traits:", predicted)
```

## 🧠 Model Details

* **Base model:** [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased)
* **Architecture:** BERT with a linear classification head
* **Task type:** Multi-label classification
* **Loss function:** Binary Cross Entropy with Logits
* **Training data:** Custom Turkish dataset with personality trait annotations (e.g., CVs, social texts)
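
The training code is not part of this card, so the following is only a sketch of what "Binary Cross Entropy with Logits" computes for one example with a multi-hot target. The function name `bce_with_logits` and the sample values are hypothetical. The formula is the standard numerically stable form, which averages over labels in the same way as `torch.nn.BCEWithLogitsLoss` with its default mean reduction.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable identity
        loss(x, y) = max(x, 0) - x * y + log(1 + exp(-|x|))
    so that large positive or negative logits do not overflow.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Hypothetical example: 8 logits and a multi-hot target marking
# "lider" (index 2) and "deneyimli" (index 7) as present.
logits = [-2.0, -3.0, 2.5, 0.1, -1.5, -2.2, 0.4, 3.0]
targets = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print(f"loss = {bce_with_logits(logits, targets):.4f}")
```

Each label contributes its own binary term, which is why the targets can contain any number of ones, including none.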

## 📈 Performance

The model was evaluated on a held-out portion of the dataset. The values below are placeholders; replace them with the real metrics:

| Metric    | Value |
| --------- | ----- |
| Accuracy  | 0.92  |
| F1-Score  | 0.94  |
| Precision | 0.91  |
| Recall    | 0.96  |
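
The card does not say how these metrics are averaged across labels. As one possible reading, the sketch below computes micro-averaged precision, recall, and F1 together with exact-match (subset) accuracy from multi-hot label rows. The function name `multilabel_metrics` and the toy data are hypothetical and do not reproduce the numbers in the table.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 and exact-match accuracy.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per text.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and aligned")
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        # A row counts as exactly right only if every label matches.
        exact += int(true_row == pred_row)
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "subset_accuracy": exact / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data with three labels per row, for illustration only.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(multilabel_metrics(y_true, y_pred))
```

Subset accuracy is strict for multi-label tasks, since a single missed trait makes the whole row count as wrong, so it is worth stating which notion of accuracy a reported figure uses.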

## 🔍 Applications

* CV analysis and candidate profiling
* Smart recruiting and HR systems
* Social media or forum persona evaluation
* Turkish personality-aware recommendation systems

## 📁 Files Included

* `pytorch_model.bin` – fine-tuned model weights
* `config.json` – model configuration
* `tokenizer_config.json`, `vocab.txt` – tokenizer files

## 🤝 Acknowledgments

This project builds upon [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased). Thanks to the Turkish NLP community for contributions and datasets.

## 📬 Contact

If you have questions or suggestions, feel free to open an issue on the [model page](https://huggingface.co/MUR55/bert_turkish_personality_analysis) or contact the author.