metadata
language:
  - tr
base_model:
  - dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
tags:
  - text-classification
  - multi-label-classification
  - personality
  - bert
  - pytorch
  - transformers
  - turkish
  - classification
  - human-resources
  - custom-trained
license: apache-2.0

bert_turkish_personality_analysis

This repository hosts a Turkish BERT model fine-tuned for multi-label personality trait classification. Built on top of dbmdz/bert-base-turkish-cased, this model predicts psychological and professional personality traits from Turkish text input.

🎯 Task: Multi-label Personality Trait Detection

Given a CV, personal statement, or written expression, the model assigns zero or more traits from the following set:

🏷️ Supported Labels

  • özgüvenli – confident
  • içe kapanık – introverted
  • lider – leader
  • takım oyuncusu – team player
  • kararsız – indecisive
  • abartılı – exaggerated
  • profesyonel – professional
  • deneyimli – experienced

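The model performs multi-label classification: each label's logit is passed through a sigmoid independently, and a label is assigned when its probability reaches a threshold (0.5 in the usage example below). Because the labels are scored independently, a text can receive several traits or none.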
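
The decision rule can be sketched in plain Python without loading the model. The logits below are made-up values for illustration, not outputs of this model, and `predict_labels` is a hypothetical helper name.

```python
import math

LABELS = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu",
          "kararsız", "abartılı", "profesyonel", "deneyimli"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1) without overflowing."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, labels=LABELS, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Illustrative logits only (not produced by the model).
example_logits = [1.2, -3.0, 2.5, 0.4, -1.1, -2.2, -0.3, 3.1]
print(predict_labels(example_logits))
```

Raising `threshold` trades recall for precision; the value 0.5 is simply the one used in the usage example.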

🔧 Usage Example

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "MUR55/bert_turkish_personality_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Sample text
text = "5 yıllık yöneticilik tecrübemle liderlik becerilerimi geliştirdim, aynı zamanda ekip çalışmalarına önem veririm."

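# Tokenize and predict; no_grad() skips gradient tracking during inference
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
# Sigmoid scores each label independently (multi-label, so no softmax)
probs = torch.sigmoid(outputs.logits)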

# Threshold to determine label presence.
# The order of `labels` is assumed to match the model's output indices.
threshold = 0.5
labels = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu", "kararsız", "abartılı", "profesyonel", "deneyimli"]
predicted = [label for label, prob in zip(labels, probs[0]) if prob >= threshold]

print("Predicted traits:", predicted)
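
The hard-coded `labels` list above relies on a particular output order. A minimal sketch of a more defensive lookup follows, assuming the checkpoint's config may or may not carry label names in `id2label`. Whether this repository's `config.json` defines them is not stated in this card, so `resolve_labels` (a hypothetical helper) falls back to the list from the example when only the generic `LABEL_i` names are present.

```python
def resolve_labels(id2label, fallback):
    """Return label names ordered by output index.

    id2label: dict shaped like model.config.id2label (index -> name).
    fallback: list used when the config only holds generic "LABEL_i" names.
    """
    names = [id2label[i] for i in sorted(id2label)]
    if all(name.startswith("LABEL_") for name in names):
        if len(fallback) != len(names):
            raise ValueError("fallback list does not match the number of outputs")
        return list(fallback)
    return names
```

With the usage example above, this would be called as `labels = resolve_labels(model.config.id2label, labels)` before building `predicted`.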

🧠 Model Details

  • Base model: dbmdz/bert-base-turkish-cased
  • Architecture: BERT with a linear classification head
  • Task type: Multi-label classification
  • Loss Function: Binary Cross Entropy with Logits
  • Training Data: Custom Turkish dataset with personality trait annotations (e.g., CVs, social texts)
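
For reference, binary cross-entropy with logits treats each of the eight labels as its own yes/no problem. The sketch below writes out the per-element loss in its numerically stable form and averages it over all labels, which is the default reduction of PyTorch's `BCEWithLogitsLoss`. The logits and targets are made-up values; the actual training code is not part of this card.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed directly from logits.

    Per element: max(x, 0) - x * y + log(1 + exp(-|x|)),
    which equals -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))]
    but avoids overflow for large |x|.
    """
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Illustrative values: one text, eight labels, multi-hot target.
logits = [2.0, -1.5, 3.0, 0.5, -2.0, -3.0, 1.0, 2.5]
targets = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
print(round(bce_with_logits(logits, targets), 4))
```

Because each label contributes its own term, a confident wrong score on one trait is penalized regardless of how the other traits were scored.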

📈 Performance

The model was evaluated on a held-out portion of the dataset. The values below are placeholders and should be replaced with the real metrics:

| Metric    | Value |
|-----------|-------|
| Accuracy  | 0.92  |
| F1-Score  | 0.94  |
| Precision | 0.91  |
| Recall    | 0.96  |
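
The card does not say how these metrics are averaged across labels. As one common convention for multi-label evaluation, the sketch below computes micro-averaged precision, recall, and F1 plus exact-match (subset) accuracy from multi-hot predictions. The function name and the three-label toy data are illustrative assumptions, not the author's evaluation script.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 and subset accuracy.

    y_true, y_pred: lists of equal-length 0/1 lists, one row per text.
    """
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # label present and predicted
            elif t == 0 and p == 1:
                fp += 1  # label predicted but absent
            elif t == 1 and p == 0:
                fn += 1  # label present but missed
        if true_row == pred_row:
            exact += 1  # every label in the row is correct
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "subset_accuracy": exact / len(y_true)}


# Toy data: two texts, three labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(multilabel_metrics(y_true, y_pred))
```

Subset accuracy is strict, since a row counts only if all of its labels match, so it is usually lower than the per-label scores.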

🔍 Applications

  • CV analysis and candidate profiling
  • Smart recruiting and HR systems
  • Social media or forum persona evaluation
  • Turkish personality-aware recommendation systems

📁 Files Included

  • pytorch_model.bin – fine-tuned model weights
  • config.json – model configuration
  • tokenizer_config.json, vocab.txt – tokenizer files

🤝 Acknowledgments

This project builds upon dbmdz/bert-base-turkish-cased. Thanks to the Turkish NLP community for contributions and datasets.

📬 Contact

If you have questions or suggestions, feel free to open an issue on the model page or contact the author.