---
license: mit
tags:
- fairness
- classification
metrics:
- accuracy
papers:
- https://arxiv.org/abs/2507.20708
---

# EIF Biased Classifiers (Multi-Dataset Benchmark)

## Overview

This repository contains a collection of neural network models trained on seven tabular datasets for the study:

**Exposing the Illusion of Fairness (EIF): Auditing Vulnerabilities to Distributional Manipulation Attacks**
https://arxiv.org/abs/2507.20708

Codebase:
https://github.com/ValentinLafargue/Inspection

Each model corresponds to a specific dataset and is designed to analyze fairness properties rather than maximize predictive performance.

## 🧠 Model Description

All models are **multilayer perceptrons (MLPs)** trained on tabular data.

- Fully connected neural networks
- Hidden layers: configurable (`n_loop`, `n_nodes`)
- Activation: ReLU (optional)
- Output: Sigmoid
- Prediction: $\hat{Y} \in [0,1]$

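The bullets above can be sketched as a framework-free forward pass. This is an illustration only, not the `Network` class from the codebase: it assumes that `n_loop` is the number of hidden layers and `n_nodes` their width, and the helper names `init_mlp` and `forward`, the random initialization, and the `use_relu` flag are all hypothetical.

```python
import math
import random


def init_mlp(n_features, n_loop, n_nodes, seed=0):
    """Random weights for n_loop hidden layers of n_nodes units plus one output unit.

    Assumption: n_loop = number of hidden layers, n_nodes = units per hidden layer.
    """
    rng = random.Random(seed)
    sizes = [n_features] + [n_nodes] * n_loop + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        weights = [
            [rng.uniform(-1.0, 1.0) / math.sqrt(fan_in) for _ in range(fan_in)]
            for _ in range(fan_out)
        ]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, x, use_relu=True):
    """Return the sigmoid score in [0, 1] for a single feature vector x."""
    h = list(x)
    for i, (weights, biases) in enumerate(layers):
        # Fully connected layer: one dot product per output unit.
        h = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if i == len(layers) - 1:
            h = [1.0 / (1.0 + math.exp(-z)) for z in h]  # sigmoid on the output unit
        elif use_relu:
            h = [max(0.0, z) for z in h]  # optional ReLU on hidden layers
    return h[0]


layers = init_mlp(n_features=10, n_loop=2, n_nodes=16)
score = forward(layers, [0.5] * 10)  # a single value in [0, 1]
```

The sigmoid output is a score rather than a hard label, so any binary decision requires a threshold chosen by the user.
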
## Datasets, Sensitive Attributes, and Disparate Impact

| Dataset | Adult[1] | INC[2] | TRA[2] | MOB[2] | BAF[3] | EMP[2] | PUC[2] |
|--------|------|-----|-----|-----|-----|-----|-----|
| **Sensitive Attribute (S)** | Sex | Sex | Sex | Age | Age | Disability | Disability |
| **Disparate Impact (DI)** | 0.30 | 0.67 | 0.69 | 0.45 | 0.35 | 0.30 | 0.32 |

```
[1]: Becker, B. and Kohavi, R. (1996). Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20,
https://www.kaggle.com/datasets/uciml/adult-census-income.

[2]: Ding, F., Hardt, M., Miller, J., and Schmidt, L. (2021). Retiring Adult: New datasets for fair machine learning. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems,
https://github.com/socialfoundations/folktables.

[3]: Jesus, S., Pombal, J., Alves, D., Cruz, A., Saleiro, P., Ribeiro, R. P., Gama, J., and Bizarro, P. (2022). Turning the tables: Biased, imbalanced, dynamic tabular datasets for ML evaluation. In Advances in Neural Information Processing Systems,
https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022.
```

### Notes
- Adult dataset: 5,000 test samples
- Other datasets: 20,000 test samples
- Sensitive attributes are used for fairness evaluation

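As a rough guide to reading the DI row, a common definition is the ratio of positive-prediction rates between groups, $\mathrm{DI} = P(\hat{Y}=1 \mid S=0) / P(\hat{Y}=1 \mid S=1)$, with $S=0$ taken as the unprivileged group. The sketch below assumes that convention, a 0.5 decision threshold, and a binary encoding of S. None of these are confirmed by this card, and the function name `disparate_impact` and the toy inputs are illustrative.

```python
def disparate_impact(scores, s, threshold=0.5):
    """Ratio of positive-prediction rates: P(Y_hat=1 | S=0) / P(Y_hat=1 | S=1).

    Assumptions: s is encoded as 0/1 and a score >= threshold counts as positive.
    """
    positives = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for score, group in zip(scores, s):
        totals[group] += 1
        positives[group] += int(score >= threshold)
    if totals[0] == 0 or totals[1] == 0:
        raise ValueError("both groups must be present")
    rate_s0 = positives[0] / totals[0]
    rate_s1 = positives[1] / totals[1]
    if rate_s1 == 0:
        raise ValueError("P(Y_hat=1 | S=1) is zero, so DI is undefined")
    return rate_s0 / rate_s1


# Toy data, not taken from any dataset in this card.
scores = [0.9, 0.2, 0.1, 0.4, 0.8, 0.7, 0.3, 0.6]
s = [0, 0, 0, 0, 1, 1, 1, 1]
di = disparate_impact(scores, s)  # (1/4) / (3/4)
```

Under this convention a DI of 1 means both groups receive positive predictions at the same rate, and values well below 1 mean the group with S=0 receives them less often.
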
## Predictive Performance (Accuracy)

| Dataset | Accuracy |
|--------|----------|
| Adult Census Income | 84% |
| Folktables Income (INC) | 88% |
| Folktables Mobility (MOB) | 84% |
| Folktables Employment (EMP) | 77% |
| Folktables Travel Time (TRA) | 72% |
| Folktables Public Coverage (PUC) | 73% |
| Bank Account Fraud (BAF) | 98% |

**Note:** The high accuracy on BAF is due to strong class imbalance.
Accuracy was **not the main objective** of this study.

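To see why accuracy alone says little under class imbalance, consider a trivial classifier that always predicts the majority class. The labels below are hypothetical (a 1% positive rate chosen for illustration, not the BAF prevalence), and the helper names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of positive labels that were predicted positive."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return true_pos / sum(t == 1 for t in y_true)


# Hypothetical labels: 10 positives out of 1,000 samples.
y_true = [1] * 10 + [0] * 990
# A classifier that always predicts the majority (negative) class.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)  # 990 / 1000
rec = recall(y_true, y_pred)    # 0 / 10
```

The always-negative classifier reaches high accuracy while detecting no positive case, which is why accuracy on a heavily imbalanced dataset should be read with care.
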
## 🎯 Intended Use

These models are intended for:

- Fairness analysis
- Studying disparate impact and bias
- Reproducing results from the EIF paper
- Benchmarking fairness-aware methods

## ⚠️ Limitations and Non-Intended Use

- Not designed for production
- Not optimized for predictive performance
- Should not be used for real-world decision-making

These models intentionally expose biases in standard ML pipelines.

## ⚖️ Ethical Considerations

This work highlights:
- The presence of bias in machine learning models
- The limitations of fairness metrics

Models should be interpreted as **analytical tools**, not fair systems.

## 📦 Repository Structure

Each dataset corresponds to a subfolder:

EIF-biased-classifier/ <br/>
├── ASC_ADULT_model/<br/>
├── ASC_INC_model/<br/>
├── ASC_MOB_model/<br/>
├── ASC_EMP_model/<br/>
├── ASC_TRA_model/<br/>
├── ASC_PUC_model/<br/>
└── ASC_BAF_model/<br/>

Each folder contains:
- `config.json`
- `model.safetensors`

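The folder names above follow an `ASC_<DATASET>_model` pattern, so a small helper can map a dataset abbreviation to its subfolder. The helper name `subfolder_for` is hypothetical and not part of the codebase.

```python
# Dataset abbreviations as they appear in the folder names listed above.
DATASETS = ("ADULT", "INC", "MOB", "EMP", "TRA", "PUC", "BAF")


def subfolder_for(dataset):
    """Return the repository subfolder for a dataset abbreviation (case-insensitive)."""
    code = dataset.upper()
    if code not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    return f"ASC_{code}_model"


subfolders = [subfolder_for(name) for name in DATASETS]
```

The returned string is the kind of value passed as the `subfolder` argument when loading a model.
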
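## Usage

```python
# `Network` is assumed to be the model class defined in the codebase linked
# above; import it from there before running this snippet.
model = Network.from_pretrained(
    "ValentinLAFARGUE/EIF-biased-classifier",
    subfolder="ASC_INC_model",  # one subfolder per dataset
)
```
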
## Citation

```
@misc{lafargue2026exposingillusionfairnessauditing,
  title={Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks},
  author={Valentin Lafargue and Adriana Laurindo Monteiro and Emmanuelle Claeys and Laurent Risser and Jean-Michel Loubes},
  year={2026},
  eprint={2507.20708},
  url={https://arxiv.org/abs/2507.20708},
}
```

## Additional Notes

- Models are intentionally simple to isolate fairness behavior
- Results depend on preprocessing and sampling choices
- Focus is on reproducibility