	---
	language:
	- tr
	base_model:
	- dbmdz/bert-base-turkish-cased
	pipeline_tag: text-classification
	tags:
	- text-classification
	- multi-label-classification
	- personality
	- bert
	- pytorch
	- transformers
	- turkish
	- classification
	- human-resources
	- custom-trained
	license: apache-2.0
	---

	# bert\_turkish\_personality\_analysis

	This repository hosts a Turkish BERT model fine-tuned for multi-label personality trait classification.
	Built on top of `dbmdz/bert-base-turkish-cased`, this model predicts psychological and professional personality traits from Turkish text input.


	## 🎯 Task: Multi-label Personality Trait Detection

	Given a CV, personal statement, or written expression, the model assigns zero or more traits from the following set:

	### 🏷️ Supported Labels

	* `özgüvenli` – confident
	* `içe kapanık` – introverted
	* `lider` – leader
	* `takım oyuncusu` – team player
	* `kararsız` – indecisive
	* `abartılı` – exaggerated
	* `profesyonel` – professional
	* `deneyimli` – experienced

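	The model performs multi-label classification: a sigmoid is applied to each output logit independently, and a label is assigned when its probability reaches a threshold (0.5 in the usage example below).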
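
The sigmoid-and-threshold step can be sketched in plain Python. This is a minimal illustration rather than the author's code: the logit values are made up, the helper name `decode_traits` is hypothetical, and the label order and the 0.5 threshold are assumed to match the usage example in this card.

```python
import math

# Label order assumed to match the list in this card.
LABELS = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu",
          "kararsız", "abartılı", "profesyonel", "deneyimli"]

def sigmoid(x: float) -> float:
    """Numerically stable sigmoid: maps a raw logit to a value in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def decode_traits(logits, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]

# Made-up logits, for illustration only.
example_logits = [1.2, -3.0, 2.5, 0.4, -1.1, -2.0, -0.3, 3.1]
print(decode_traits(example_logits))
```

Because each label is thresholded on its own, the result can be empty or contain several traits at once, which is what distinguishes this from a softmax-based single-label classifier.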


	## 🔧 Usage Example

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load tokenizer and model
	model_name = "MUR55/bert_turkish_personality_analysis"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	# Sample text
	text = "5 yıllık yöneticilik tecrübemle liderlik becerilerimi geliştirdim, aynı zamanda ekip çalışmalarına önem veririm."

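	# Tokenize and predict (inference only, so gradient tracking is disabled)
	inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
	with torch.no_grad():
	    outputs = model(**inputs)
	# Sigmoid yields an independent probability for each label
	probs = torch.sigmoid(outputs.logits)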

	# Threshold to determine label presence
	threshold = 0.5
	labels = ["özgüvenli", "içe kapanık", "lider", "takım oyuncusu", "kararsız", "abartılı", "profesyonel", "deneyimli"]
	predicted = [label for label, prob in zip(labels, probs[0]) if prob >= threshold]

	print("Predicted traits:", predicted)
	```


	## 🧠 Model Details

	* Base model: [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased)
	* Architecture: BERT with a linear classification head
	* Task type: Multi-label classification
	* Loss Function: Binary Cross Entropy with Logits
	* Training Data: Custom Turkish dataset with personality trait annotations (e.g., CVs, social texts)
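
To make the loss listed above concrete, here is a minimal pure-Python sketch of binary cross-entropy computed from logits for a single multi-hot target. It is an illustration under assumptions, not the training code for this model: the function name `bce_with_logits` and the example values are hypothetical, and averaging over labels is assumed (this mirrors the default `mean` reduction of `torch.nn.BCEWithLogitsLoss`).

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which equals
    -[y*log(s) + (1 - y)*log(1 - s)] with s = sigmoid(z).
    """
    if not logits or len(logits) != len(targets):
        raise ValueError("logits and targets must be non-empty and equal in length")
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Hypothetical example: eight logits, with "lider" and "deneyimli" as the true traits.
example_logits = [0.3, -2.0, 2.2, 0.1, -1.5, -2.5, 0.4, 1.8]
example_target = [0, 0, 1, 0, 0, 0, 0, 1]
print(round(bce_with_logits(example_logits, example_target), 4))
```

Each label contributes its own binary term, so one text can be penalized for missing `lider` while being rewarded for correctly omitting `kararsız`.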


	## 📈 Performance

	The model was evaluated on a held-out portion of the dataset. The figures below are placeholders, to be replaced with the real metrics:

	| Metric    | Value |
	| --------- | ----- |
	| Accuracy  | 0.92  |
	| F1-Score  | 0.94  |
	| Precision | 0.91  |
	| Recall    | 0.96  |
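
The card does not say how these four metrics are aggregated across labels. One common convention for multi-label tasks, shown here purely as an assumption, is micro-averaged precision, recall, and F1 together with exact-match (subset) accuracy. The sketch below uses made-up predictions and a hypothetical helper name, `multilabel_metrics`; it does not reproduce the numbers in the table.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 and subset accuracy for 0/1 label matrices."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    tp = fp = fn = exact = 0
    for t_row, p_row in zip(y_true, y_pred):
        # Subset accuracy counts a sample only if every label matches.
        exact += int(list(t_row) == list(p_row))
        for t, p in zip(t_row, p_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "subset_accuracy": exact / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up ground truth and predictions for three texts and four labels.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 0]]
print(multilabel_metrics(y_true, y_pred))
```

Subset accuracy is much stricter than per-label accuracy, so stating which definition was used makes reported figures easier to compare.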


	## 🔍 Applications

	* CV analysis and candidate profiling
	* Smart recruiting and HR systems
	* Social media or forum persona evaluation
	* Turkish personality-aware recommendation systems


	## 📁 Files Included

	* `pytorch_model.bin` – fine-tuned model weights
	* `config.json` – model configuration
	* `tokenizer_config.json`, `vocab.txt` – tokenizer files


	## 🤝 Acknowledgments

	This project builds upon [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased). Thanks to the Turkish NLP community for contributions and datasets.


	## 📬 Contact

	If you have questions or suggestions, feel free to open an issue on the [model page](https://huggingface.co/MUR55/bert_turkish_personality_analysis) or contact the author.