---
license: mit
language:
- en
metrics:
- precision
- recall
- f1
- accuracy
- roc_auc
pipeline_tag: tabular-classification
tags:
- classification
- psychology
---

# Model Card for Infinitode/BPPM-OPEN-ARC

Repository: https://github.com/Infinitode/OPEN-ARC/

## Model Description

OPEN-ARC-BPP is a simple XGBClassifier model developed as part of Infinitode's OPEN-ARC initiative. It was designed to determine a person's basic personality type (introvert/extrovert) based on several personal questions.

**Architecture**:

- **XGBClassifier**: `n_estimators=200`, `learning_rate=0.1`, `max_depth=4`, `eval_metric="logloss"`, `use_label_encoder=False`, `random_state=42`.
- **Framework**: XGBoost
- **Training Setup**: Trained with the fixed hyperparameters listed above.
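
The hyperparameters above map directly onto the `XGBClassifier` constructor. The following is a minimal sketch, assuming the `xgboost` package is installed; the names `PARAMS` and `build_model` are illustrative and do not come from the repository. Newer XGBoost releases have deprecated `use_label_encoder`, so it may be ignored with a warning there.

```python
# Hyperparameters exactly as listed in this model card.
PARAMS = {
    "n_estimators": 200,
    "learning_rate": 0.1,
    "max_depth": 4,
    "eval_metric": "logloss",
    "use_label_encoder": False,  # deprecated in newer XGBoost releases
    "random_state": 42,
}


def build_model(params=PARAMS):
    """Return an untrained XGBClassifier (hypothetical helper, not the author's code)."""
    # Imported lazily so the parameter dict can be inspected without xgboost installed.
    from xgboost import XGBClassifier

    return XGBClassifier(**params)
```

Calling `build_model().fit(X_train, y_train)` would then train the classifier on already preprocessed features.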

## Uses

- Identifying a person's basic personality type through a series of personal questions.
- Enhancing knowledge and research in psychology and human behavior.

## Limitations

- May misclassify individuals, because self-reported answers can be inconsistent and some people are exceptions to the patterns the model has learned.

## Training Data

- Dataset: Personality Dataset (Introvert or Extrovert) from Kaggle.
- Source URL: https://www.kaggle.com/datasets/hardikchhipa28/personality-dataset-introvert-or-extrovert
- Content: Personality-determining features such as hours spent alone per day, etc.
- Size: The size is unknown as the dataset is no longer publicly available.
- Preprocessing: Imputation of missing numerical and categorical values, one-hot encoding of categorical features, and label encoding of the target.
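
To make the preprocessing steps concrete, here is a dependency-free sketch of what each one does. It is an illustration only, not the training code: the imputation strategies (median and most frequent value), the helper names, and the toy values are all assumptions, and the class names are assumed to be `Introvert` and `Extrovert`.

```python
from collections import Counter
from statistics import median


def impute_numeric(values):
    """Replace None with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace None with the most frequent observed value (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode categories as 0/1 columns, one column per category in sorted order."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


def label_encode(labels):
    """Map class names to integer ids in sorted order."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [index[y] for y in labels]


# Toy columns named after two of the model's features; the values are made up.
time_alone = impute_numeric([4.0, None, 9.0])
stage_fear = impute_categorical(["No", "Yes", None, "Yes"])
categories, stage_fear_encoded = one_hot(stage_fear)
classes, y = label_encode(["Extrovert", "Introvert", "Extrovert"])
```

The saved artifacts contain fitted imputers and encoders that apply the same kind of transformations at inference time.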

## Training Procedure

- Metrics: accuracy, precision, recall, F1, ROC AUC
- Train/Test Split: 75% train, 25% test.
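
A 75/25 split can be sketched with the standard library alone. This is a minimal illustration: the function name and the seed are assumptions, and this card does not state whether the original split was shuffled or stratified.

```python
import random


def train_test_split_75_25(rows, seed=42):
    """Shuffle a copy of `rows` and return (train, test) with a 75% / 25% split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumed seed)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


# 100 placeholder row ids stand in for the real dataset rows.
train, test = train_test_split_75_25(range(100))
```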

## Evaluation Results

| Metric | Value |
| ------ | ----- |
| Testing Accuracy | 92.0% |
| Testing Weighted Average Precision | 92% |
| Testing Weighted Average Recall | 92% |
| Testing Weighted Average F1 | 92% |
| Testing ROC AUC | 95.5% |
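
The weighted averages in the table weight each class's score by its share of the test samples. The sketch below shows that computation in plain Python; the helper name and the toy labels are illustrative, and the printed results are not the model's.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall, and F1 across all classes."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        p_cls = tp / predicted if predicted else 0.0
        r_cls = tp / n
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if p_cls + r_cls else 0.0
        weight = n / total  # class share of the true labels
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls
    return precision, recall, f1


# Made-up labels: one Introvert is misclassified as Extrovert.
y_true = ["Introvert", "Introvert", "Extrovert", "Extrovert"]
y_pred = ["Introvert", "Extrovert", "Extrovert", "Extrovert"]
precision, recall, f1 = weighted_scores(y_true, y_pred)
```

Support-weighted recall is always equal to accuracy, which is why those two rows in the table agree.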

## How to Use

```python
import joblib
import pandas as pd
import numpy as np

# Load artifacts
art = joblib.load("personality_artifacts.pkl")
model          = art["model"]
num_imputer    = art["num_imputer"]
cat_imputer    = art["cat_imputer"]
ohe            = art["ohe"]
le             = art["label_encoder"]
num_cols       = art["num_cols"]
cat_cols       = art["cat_cols"]

def ask(prompt, cast=str, options=None):
    """Tiny helper to get clean input."""
    while True:
        try:
            # Strip surrounding whitespace so " yes " is accepted as "Yes"
            val = cast(input(prompt).strip())
            if options and val not in options:
                raise ValueError(f"Must be one of {options}")
            return val
        except Exception as e:
            print(f"Error: {e}. Try again.\n")

def predict_personality():
    print("\n🎭  Introvert vs Extrovert Predictor")
    print("Answer a few quick questions:\n")

    # Gather answers
    answers = {
        "Time_spent_Alone":       ask("Hours spent alone per day (0‑24): ", float),
        "Stage_fear":             ask("Stage fear? (Yes/No): ", str.title, ["Yes", "No"]),
        "Social_event_attendance":ask("Social events per week (0‑10): ", int),
        "Going_outside":          ask("Trips outside per day (0‑10): ", int),
        "Drained_after_socializing": ask("Feel drained after socializing? (Yes/No): ",
                                         str.title, ["Yes", "No"]),
        "Friends_circle_size":    ask("Number of close friends (0‑30): ", int),
        "Post_frequency":         ask("Social‑media posts per week (0‑30): ", int),
    }

    # Build one‑row DataFrame in correct column order
    row = pd.DataFrame([answers])[num_cols + cat_cols]

    # --- Re‑run exact preprocessing -------------------------------------------------
    X_num = num_imputer.transform(row[num_cols])
    X_cat = cat_imputer.transform(row[cat_cols])
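    X_cat_enc = ohe.transform(X_cat)
    # Densify in case the encoder returns a SciPy sparse matrix
    if hasattr(X_cat_enc, "toarray"):
        X_cat_enc = X_cat_enc.toarray()
    X_ready = np.hstack([X_num, X_cat_enc])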

    # --- Predict --------------------------------------------------------------------
    proba = model.predict_proba(X_ready)[0]
    idx = proba.argmax()
    pred_label = le.inverse_transform([idx])[0]
    confidence = proba[idx]

    print(f"\n🔮 You are likely an **{pred_label}** (confidence {confidence:.0%}).")

predict_personality()
```

## Contact

For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.