---
license: mit
tags:
- fairness
- classification
metrics:
- accuracy
papers:
- https://arxiv.org/abs/2507.20708
---
# Exposing the Illusion of Fairness (EIF): Biased models whose results were later fairwashed
## Overview
This repository contains a collection of neural network models trained on seven tabular datasets for the study:
**Exposing the Illusion of Fairness (EIF): Auditing Vulnerabilities to Distributional Manipulation Attacks** <br/>
https://arxiv.org/abs/2507.20708
Codebase: <br/>
https://github.com/ValentinLafargue/Inspection
Results: <br/>
https://huggingface.co/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions
Each model corresponds to a specific dataset and is designed to analyze fairness properties rather than maximize predictive performance.
## Model Description
All models are **multilayer perceptrons (MLPs)** trained on tabular data.
- Fully connected neural networks
- Hidden layers: configurable (`n_loop`, `n_nodes`)
- Activation: ReLU (optional)
- Output: Sigmoid
- Prediction: $\hat{Y} \in [0,1]$
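The architecture described above can be sketched as follows. This is a minimal illustration, not the exact class from the codebase: the constructor arguments `n_loop` and `n_nodes` mirror the configurable hidden-layer settings mentioned above, but the default sizes here are assumptions.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Fully connected classifier: `n_loop` hidden layers of `n_nodes` units,
    ReLU activations, and a sigmoid output so predictions lie in [0, 1]."""

    def __init__(self, n_features: int, n_loop: int = 2, n_nodes: int = 64):
        super().__init__()
        layers = []
        in_dim = n_features
        for _ in range(n_loop):
            layers += [nn.Linear(in_dim, n_nodes), nn.ReLU()]
            in_dim = n_nodes
        # Single sigmoid output head: Y_hat in [0, 1]
        layers += [nn.Linear(in_dim, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

model = MLP(n_features=10)
probs = model(torch.randn(4, 10))  # one score in [0, 1] per row
```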
## Datasets, Sensitive Attributes, and Disparate Impact
| Dataset | Adult[1] | INC[2] | TRA[2] | MOB[2] | BAF[3] | EMP[2] | PUC[2] |
|--------|------|-----|-----|-----|-----|-----|-----|
| **Sensitive Attribute (S)** | Sex | Sex | Sex | Age | Age | Disability | Disability |
| **Disparate Impact (DI)** | 0.30 | 0.67 | 0.69 | 0.45 | 0.35 | 0.30 | 0.32 |
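Disparate impact is conventionally the ratio of positive-prediction rates between the unprivileged and the privileged group; values well below 1 (as in the table above) indicate strong bias. A minimal sketch of that computation, assuming a binary sensitive attribute encoded with 0 for the unprivileged group:

```python
import numpy as np

def disparate_impact(y_pred, s):
    """DI = P(Y_hat = 1 | S = 0) / P(Y_hat = 1 | S = 1).

    y_pred: binary predictions; s: binary sensitive attribute
    (0 = unprivileged group, by assumption here).
    """
    y_pred, s = np.asarray(y_pred), np.asarray(s)
    rate_unpriv = y_pred[s == 0].mean()  # positive rate, unprivileged group
    rate_priv = y_pred[s == 1].mean()    # positive rate, privileged group
    return rate_unpriv / rate_priv

di = disparate_impact([1, 0, 1, 1], [0, 0, 1, 1])  # 0.5 / 1.0 = 0.5
```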
```
[1]: Becker, B. and Kohavi, R. (1996). Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20,
https://www.kaggle.com/datasets/uciml/adult-census-income.
[2]: Ding, F., Hardt, M., Miller, J., and Schmidt, L. (2021). Retiring Adult: New datasets for fair machine learning. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems,
https://github.com/socialfoundations/folktables.
[3]: Jesus, S., Pombal, J., Alves, D., Cruz, A., Saleiro, P., Ribeiro, R. P., Gama, J., and Bizarro, P. (2022). Turning the tables: Biased, imbalanced, dynamic tabular datasets for ML evaluation. In Advances in Neural Information Processing Systems,
https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022.
```
### Notes
- Adult dataset: 5,000 test samples
- Other datasets: 20,000 test samples
- Sensitive attributes are used for fairness evaluation
### Results and manipulated results
The results obtained on the test samples, and their fairwashed counterparts, are directly available on [Hugging Face](https://huggingface.co/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions).
## Predictive Performance (Accuracy)
| Dataset | Accuracy |
|--------|----------|
| Adult Census Income | 84% |
| Folktables Income (INC) | 88% |
| Folktables Mobility (MOB) | 84% |
| Folktables Employment (EMP) | 77% |
| Folktables Travel Time (TRA) | 72% |
| Folktables Public Coverage (PUC) | 73% |
| Bank Account Fraud (BAF) | 98% |
**Note:** The high accuracy on BAF is an artifact of its strong class imbalance.
Accuracy was **not the main objective** of this study.
## Intended Use
These models are intended for:
- Fairness analysis
- Studying disparate impact and bias
- Reproducing results from the EIF paper
- Benchmarking fairness-aware methods
## Limitations and Non-Intended Use
- Not designed for production
- Not optimized for predictive performance
- Should not be used for real-world decision-making
These models intentionally expose biases in standard ML pipelines.
## Ethical Considerations
This work highlights:
- The presence of bias in machine learning models
- The limitations of fairness metrics
Models should be interpreted as **analytical tools**, not fair systems.
## Repository Structure
Each dataset corresponds to a subfolder:
EIF-biased-classifier/ <br/>
├── ASC_ADULT_model/<br/>
├── ASC_INC_model/<br/>
├── ASC_MOB_model/<br/>
├── ASC_EMP_model/<br/>
├── ASC_TRA_model/<br/>
├── ASC_PUC_model/<br/>
└── ASC_BAF_model/<br/>
Each folder contains:
- `config.json`
- `model.safetensors`
## Usage
```python
# `Network` is the MLP class defined in the accompanying codebase
# (https://github.com/ValentinLafargue/Inspection)
model = Network.from_pretrained(
    "ValentinLAFARGUE/EIF-biased-classifier",
    subfolder="ASC_INC_model",
)
```
## Citation
```
@misc{lafargue2026exposingillusionfairnessauditing,
  title={Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks},
  author={Valentin Lafargue and Adriana Laurindo Monteiro and Emmanuelle Claeys and Laurent Risser and Jean-Michel Loubes},
  year={2026},
  eprint={2507.20708},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2507.20708},
}
```
## Additional Notes
- Models are intentionally simple to isolate fairness behavior
- Results depend on preprocessing and sampling choices
- Focus is on reproducibility