language:
  - tr
base_model:
  - dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
tags:
  - text-classification
  - multi-label-classification
  - personality
  - bert
  - pytorch
  - transformers
  - turkish
  - classification
  - human-resources
  - custom-trained
license: apache-2.0

bert_turkish_personality_analysis

This repository hosts a Turkish BERT model fine-tuned for multi-label personality trait classification. Built on top of dbmdz/bert-base-turkish-cased, this model predicts psychological and professional personality traits from Turkish text input.

🎯 Task: Multi-label Personality Trait Detection

Given a CV, personal statement, or written expression, the model assigns zero or more traits from the following set:

🏷️ Supported Labels

özgüvenli – confident
içe kapanık – introverted
lider – leader
takım oyuncusu – team player
kararsız – indecisive
abartılı – exaggerated
profesyonel – professional
deneyimli – experienced

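The model performs multi-label classification: a sigmoid is applied to each label's logit independently, and a label is assigned when its probability meets a threshold (0.5 in the usage example below).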
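To make the sigmoid-and-threshold step concrete, here is a minimal pure-Python sketch. The logits are invented for illustration and are not real model output, the `decode` helper is hypothetical, and the label order is assumed to match the list in the usage example.

```python
import math

# Label order assumed to match the list used in the usage example.
LABELS = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu",
          "kararsız", "abartılı", "profesyonel", "deneyimli"]

def sigmoid(z: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def decode(logits, labels=LABELS, threshold=0.5):
    """Keep every label whose independent probability meets the threshold."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]

# Hypothetical logits for one text (illustrative values only).
example_logits = [-1.2, -3.0, 2.1, 0.4, -2.5, -0.8, -0.1, 1.5]
selected = decode(example_logits)
print(selected)

# Because each label is scored on its own, a text can also receive no label.
none_selected = decode([-2.0] * len(LABELS))
print(none_selected)
```

Unlike a softmax, the per-label probabilities do not sum to one, which is what allows zero, one, or several traits per text.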

🔧 Usage Example

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "MUR55/bert_turkish_personality_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Sample text
text = "5 yıllık yöneticilik tecrübemle liderlik becerilerimi geliştirdim, aynı zamanda ekip çalışmalarına önem veririm."

# Tokenize and predict (inference only, so gradient tracking is disabled)
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)

# Threshold to determine label presence
threshold = 0.5
# Label order is assumed to match the model's output indices
labels = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu", "kararsız", "abartılı", "profesyonel", "deneyimli"]
predicted = [label for label, prob in zip(labels, probs[0].tolist()) if prob >= threshold]

print("Predicted traits:", predicted)

🧠 Model Details

Base model: dbmdz/bert-base-turkish-cased
Architecture: BERT with a linear classification head
Task type: Multi-label classification
Loss Function: Binary Cross Entropy with Logits
Training Data: Custom Turkish dataset with personality trait annotations (e.g., CVs, social texts)
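
As a rough illustration of the loss named above, the sketch below computes binary cross-entropy from raw logits in plain Python, averaged over labels. It is not the author's training code; the function name, logits, and targets are made up for the example. The averaging follows the default `mean` reduction of `torch.nn.BCEWithLogitsLoss`.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)),
    which equals -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))].
    """
    per_label = [
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, targets)
    ]
    return sum(per_label) / len(per_label)

# Hypothetical logits for one text and its multi-hot target (1 = trait present).
logits = [-1.2, -3.0, 2.1, 0.4, -2.5, -0.8, -0.1, 1.5]
targets = [0, 0, 1, 1, 0, 0, 0, 1]
loss = bce_with_logits(logits, targets)
print(f"loss = {loss:.4f}")
```

Each label contributes its own binary term, so the target can mark any number of traits as present at once.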

📈 Performance

The model was evaluated on a held-out portion of the dataset. Note that the card marks the values below as placeholders to be replaced with measured metrics:

| Metric    | Value |
|-----------|-------|
| Accuracy  | 0.92  |
| F1-Score  | 0.94  |
| Precision | 0.91  |
| Recall    | 0.96  |
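
The card does not say how these metrics are averaged across labels. The sketch below shows one common choice for multi-label evaluation: micro-averaged precision, recall, and F1, plus exact-match (subset) accuracy. The function name and toy data are hypothetical, and this may not be how the figures above were produced.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 and exact-match (subset) accuracy.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per text.
    """
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        exact += int(true_row == pred_row)  # every label must match
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1,
            "subset_accuracy": exact / len(y_true)}

# Toy data with four labels per text (illustrative only).
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 0]]
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)
```

Subset accuracy is strict, since a single wrong label makes the whole row count as a miss, so it can be far lower than the micro-averaged scores on the same predictions.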

🔍 Applications

CV analysis and candidate profiling
Smart recruiting and HR systems
Social media or forum persona evaluation
Turkish personality-aware recommendation systems

📁 Files Included

pytorch_model.bin – fine-tuned model weights
config.json – model configuration
tokenizer_config.json, vocab.txt – tokenizer files

🤝 Acknowledgments

This project builds upon dbmdz/bert-base-turkish-cased. Thanks to the Turkish NLP community for contributions and datasets.

📬 Contact

If you have questions or suggestions, feel free to open an issue on the model page or contact the author.