---
license: apache-2.0
---
---
license: mit
language: en
tags:
- sklearn
- text-classification
- psychology
- mbti
---
# MBTI Personality Predictor
This repository contains scikit-learn models for predicting MBTI personality types from text.
## Model Details
This system consists of a `TfidfVectorizer` and four separate `LogisticRegression` models, one for each of the MBTI dimensions:
* **Mind:** Introversion (I) vs. Extraversion (E)
* **Energy:** Intuition (N) vs. Sensing (S)
* **Nature:** Thinking (T) vs. Feeling (F)
* **Tactics:** Judging (J) vs. Perceiving (P)
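Because each dimension gets its own binary classifier, a four-letter type has to be split into four binary targets before training and reassembled after prediction. The following is a minimal sketch of that mapping. The helper names and the choice of `1` for the first letter of each pair are assumptions for illustration, not necessarily the encoding used by the published models.

```python
# Each tuple is (first letter, second letter) for one MBTI dimension.
DIMENSIONS = (("I", "E"), ("N", "S"), ("T", "F"), ("J", "P"))


def type_to_targets(mbti_type):
    """Split a type such as 'INTJ' into four binary targets (hypothetical helper).

    The first letter of each pair is encoded as 1, the second as 0.
    """
    mbti_type = mbti_type.upper()
    if len(mbti_type) != 4:
        raise ValueError(f"expected a four-letter type, got {mbti_type!r}")
    targets = {}
    for letter, (first, second) in zip(mbti_type, DIMENSIONS):
        if letter not in (first, second):
            raise ValueError(f"unexpected letter {letter!r} in {mbti_type!r}")
        targets[f"{first}/{second}"] = 1 if letter == first else 0
    return targets


def targets_to_type(targets):
    """Rebuild the four-letter type from four binary predictions."""
    return "".join(
        first if targets[f"{first}/{second}"] == 1 else second
        for first, second in DIMENSIONS
    )


print(type_to_targets("INTJ"))
print(targets_to_type({"I/E": 0, "N/S": 0, "T/F": 0, "J/P": 0}))
```

Training four independent binary models keeps each problem simple, at the cost of ignoring any correlation between the dimensions.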
## Intended Use
These models are intended for educational purposes and to demonstrate building an NLP classification system. They can be used to predict an MBTI type from a block of English text. **This is not a clinical or diagnostic tool.**
## Training Data
The models were trained on the [Myers-Briggs Personality Type Dataset](https://www.kaggle.com/datasets/datasnaek/mbti-type) from Kaggle, which contains over 8,600 entries of text from social media forums.
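As a rough sketch of how the raw data might be read, the example below parses a small in-memory sample. It assumes the CSV has a `type` column and a `posts` column in which individual posts are joined with `|||`. Verify this against the downloaded file before relying on it. The sample rows and the `load_rows` helper are made up for illustration.

```python
import csv
import io

# Tiny stand-in for the Kaggle CSV (hypothetical rows, assumed column layout).
SAMPLE_CSV = '''type,posts
INFJ,"First post|||Second post with a link http://example.com|||Third post"
ENTP,"Only one post here"
'''


def load_rows(handle):
    """Read (type, posts) records and split the posts field on '|||'."""
    rows = []
    for record in csv.DictReader(handle):
        posts = [p.strip() for p in record["posts"].split("|||") if p.strip()]
        rows.append({"type": record["type"], "posts": posts})
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["type"], len(row["posts"]))
```

To read a real file, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the `io.StringIO` object.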
## Training Procedure
Text was cleaned by removing URLs and punctuation, lemmatizing, and removing stopwords. The cleaned text was then vectorized using TF-IDF (`max_features=5000`, `ngram_range=(1, 2)`). Each `LogisticRegression` model was trained with `class_weight='balanced'` to counteract the class imbalance in the dataset.
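The procedure can be sketched roughly as follows for a single dimension (I/E). This is an approximation, not the original training script: lemmatization is omitted, stopwords are removed with scikit-learn's built-in English list (an assumption), `max_iter=1000` is added only to avoid convergence warnings, and the toy corpus and its labels are invented.

```python
import re
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

URL_RE = re.compile(r"https?://\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text):
    """Lowercase, drop URLs and punctuation, and collapse whitespace.

    Lemmatization is left out here to keep the sketch dependency-free.
    """
    text = URL_RE.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


# Toy corpus (hypothetical), with 1 = Introversion and 0 = Extraversion.
texts = [
    "I love staying home with a good book. http://example.com/books",
    "Quiet nights alone help me recharge!",
    "Reading alone at home is my favourite evening.",
    "Parties are great, I love meeting new people!",
    "Let's go out and talk to everyone tonight.",
    "Meeting new people at big parties gives me energy.",
]
labels_ie = [1, 1, 1, 0, 0, 0]

# TF-IDF settings taken from the description above; stop_words is an assumption.
vectorizer = TfidfVectorizer(
    max_features=5000, ngram_range=(1, 2), stop_words="english"
)
features = vectorizer.fit_transform([clean_text(t) for t in texts])

# class_weight="balanced" reweights classes inversely to their frequency.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(features, labels_ie)

new_text = ["I would rather read at home tonight."]
prediction = model.predict(vectorizer.transform([clean_text(t) for t in new_text]))[0]
print(prediction)
```

The same steps would be repeated with the N/S, F/T, and J/P labels, reusing one fitted vectorizer across all four models.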
### Evaluation Results
Macro-averaged F1 scores on the test set:
* **I/E Model:** Macro F1-Score: ~0.79
* **N/S Model:** Macro F1-Score: [Add Your Score]
* **F/T Model:** Macro F1-Score: [Add Your Score]
* **J/P Model:** Macro F1-Score: [Add Your Score]
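For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so the minority class counts as much as the majority class. The worked example below computes it by hand on made-up labels. It illustrates the metric only and does not reproduce the scores above.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that appear."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Made-up labels: four introverts and two extraverts.
y_true = ["I", "I", "I", "I", "E", "E"]
y_pred = ["I", "I", "I", "E", "E", "I"]

print(f1_for_class(y_true, y_pred, "I"))  # 0.75
print(f1_for_class(y_true, y_pred, "E"))  # 0.5
print(macro_f1(y_true, y_pred))           # 0.625
```

In practice, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity.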
## How to Use
```python
import joblib
from huggingface_hub import hf_hub_download
# Define the repo ID
repo_id = "YOUR_USERNAME/mbti-personality-predictor"
# Download all the model files
vectorizer = joblib.load(hf_hub_download(repo_id=repo_id, filename="mbti_vectorizer.joblib"))
model_ie = joblib.load(hf_hub_download(repo_id=repo_id, filename="mbti_model_ie.joblib"))
model_ns = joblib.load(hf_hub_download(repo_id=repo_id, filename="mbti_model_ns.joblib"))
model_ft = joblib.load(hf_hub_download(repo_id=repo_id, filename="mbti_model_ft.joblib"))
model_jp = joblib.load(hf_hub_download(repo_id=repo_id, filename="mbti_model_jp.joblib"))
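# Vectorize the input and predict each dimension separately.
# Note: apply the same cleaning used during training first. The values returned
# by predict() depend on how the labels were encoded at training time.
text = ["I enjoy quiet evenings spent reading and planning my week."]
features = vectorizer.transform(text)
predictions = {
    "I/E": model_ie.predict(features)[0],
    "N/S": model_ns.predict(features)[0],
    "F/T": model_ft.predict(features)[0],
    "J/P": model_jp.predict(features)[0],
}
print(predictions)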