Imbalanced dataset

#2
by Tuminha - opened

Hi everyone

Any idea how we can handle imbalanced classes? I was wondering if there's a preset similar to those in CatBoost or XGBoost.

Thank you.

Generally, no special preset is needed. If you want to optimize for a metric such as balanced_accuracy or balanced_loss, set "balance_probabilities=True" (https://github.com/PriorLabs/TabPFN/blob/main/src/tabpfn/classifier.py#L190). For scores like f1, which trade off recall and precision in a more complex way, set eval_metric="f1" (https://github.com/PriorLabs/TabPFN/blob/main/src/tabpfn/classifier.py#L214).
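To give some intuition for what balancing probabilities can mean, here is a minimal pure-Python sketch. It assumes balancing amounts to dividing each predicted class probability by that class's share of the training set and renormalizing. The helper name and the numbers are hypothetical, and this is an illustration of the idea rather than TabPFN's actual code.

```python
def balance_probabilities(probs, class_counts):
    """Divide each class probability by its training prior, then renormalize.

    Hypothetical helper for illustration only; not part of TabPFN.
    """
    total = sum(class_counts)
    priors = [count / total for count in class_counts]
    # Rare classes have small priors, so their probabilities get boosted.
    weighted = [p / prior for p, prior in zip(probs, priors)]
    norm = sum(weighted)
    return [w / norm for w in weighted]


# Made-up example: 90 negatives and 10 positives in the training data,
# and a raw prediction that favors the majority class.
raw = [0.7, 0.3]
balanced = balance_probabilities(raw, class_counts=[90, 10])

# After reweighting, the minority class has the higher probability.
predicted_class = balanced.index(max(balanced))
print(balanced, predicted_class)
```

With the real library you would not write this yourself. Based on the parameter named above, you would pass balance_probabilities=True when constructing the classifier.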

Does this help?
