---
license: mit
tags:
- fairness
- classification
metrics:
- accuracy
papers:
  - https://arxiv.org/abs/2507.20708
---

# Exposing the Illusion of Fairness (EIF): Biased models whose results were later fairwashed

## 📌 Overview

This repository contains a collection of neural network models trained on seven tabular datasets for the study:

**Exposing the Illusion of Fairness (EIF): Auditing Vulnerabilities to Distributional Manipulation Attacks**  <br/>
https://arxiv.org/abs/2507.20708  

Codebase:   <br/>
https://github.com/ValentinLafargue/Inspection  

Results: <br/>
https://huggingface.co/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions

Each model corresponds to a specific dataset and is designed to analyze fairness properties rather than maximize predictive performance.

## 🧠 Model Description

All models are **multilayer perceptrons (MLPs)** trained on tabular data.

- Fully connected neural networks  
- Hidden layers: configurable (`n_loop`, `n_nodes`)  
- Activation: ReLU (optional)  
- Output: Sigmoid  
- Prediction: $\hat{Y} \in [0,1]$  
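
The forward pass described above can be sketched in NumPy. This is a minimal illustration only: the layer sizes, the random placeholder weights, and the way `n_loop`/`n_nodes` map to the layer list are assumptions, not the released models' actual configuration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, weights, biases):
    """Fully connected MLP: ReLU hidden layers, sigmoid output in [0, 1]."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    # Final layer: sigmoid squashes the logit to a score in [0, 1]
    return sigmoid(h @ weights[-1] + biases[-1])

# Toy instantiation: e.g. n_loop=2 hidden layers of n_nodes=16 each
# (placeholder values, not the paper's hyperparameters)
rng = np.random.default_rng(0)
dims = [10, 16, 16, 1]
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]
y_hat = mlp_forward(rng.normal(size=(5, 10)), weights, biases)
```

The sigmoid output is what the card denotes $\hat{Y} \in [0,1]$; thresholding it (typically at 0.5) yields the binary prediction.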

## 📊 Datasets, Sensitive Attributes, and Disparate Impact

| Dataset | Adult[1] | INC[2] | TRA[2] | MOB[2] | BAF[3] | EMP[2] | PUC[2] |
|--------|------|-----|-----|-----|-----|-----|-----|
| **Sensitive Attribute (S)** | Sex | Sex | Sex | Age | Age | Disability | Disability |
| **Disparate Impact (DI)** | 0.30 | 0.67 | 0.69 | 0.45 | 0.35 | 0.30 | 0.32 |
[1] Becker, B. and Kohavi, R. (1996). Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20.
https://www.kaggle.com/datasets/uciml/adult-census-income

[2] Ding, F., Hardt, M., Miller, J., and Schmidt, L. (2021). Retiring Adult: New datasets for fair machine learning. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems.
https://github.com/socialfoundations/folktables

[3] Jesus, S., Pombal, J., Alves, D., Cruz, A., Saleiro, P., Ribeiro, R. P., Gama, J., and Bizarro, P. (2022). Turning the tables: Biased, imbalanced, dynamic tabular datasets for ML evaluation. In Advances in Neural Information Processing Systems.
https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022

### Notes
- Adult dataset: 5,000 test samples  
- Other datasets: 20,000 test samples  
- Sensitive attributes are used for fairness evaluation  
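
Given a model's scores and the sensitive attribute S, disparate impact can be computed as the ratio of positive-prediction rates between groups. A minimal sketch, where the function name, the 0.5 threshold, and the `s == 0` (unprivileged) / `s == 1` (privileged) encoding are illustrative assumptions, not the paper's code:

```python
import numpy as np

def disparate_impact(y_score, s, threshold=0.5):
    """Ratio of positive-prediction rates: P(Yhat=1 | s=0) / P(Yhat=1 | s=1).
    A value near 1 indicates parity; the common "80% rule" flags values below 0.8."""
    y = np.asarray(y_score) >= threshold
    s = np.asarray(s)
    rate_unpriv = y[s == 0].mean()
    rate_priv = y[s == 1].mean()
    return rate_unpriv / rate_priv

# Toy data: the privileged group (s=1) gets positive predictions more often
scores = np.array([0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.2])
groups = np.array([1,   1,   1,   1,   0,   0,   0,   0])
di = disparate_impact(scores, groups)  # 0.25 / 0.75 = 1/3
```

All DI values reported in the table above are below 0.8, i.e. every model exhibits a measurable disparity with respect to its sensitive attribute.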

### Results and manipulated results

The results obtained on the test samples, together with their fairwashed counterparts, are directly available on [Hugging Face](https://huggingface.co/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions).


## 📈 Predictive Performance (Accuracy)

| Dataset | Accuracy |
|--------|----------|
| Adult Census Income | 84% |
| Folktables Income (INC) | 88% |
| Folktables Mobility (MOB) | 84% |
| Folktables Employment (EMP) | 77% |
| Folktables Travel Time (TRA) | 72% |
| Folktables Public Coverage (PUC) | 73% |
| Bank Account Fraud (BAF) | 98% |

**Note:** High performance on BAF is due to strong class imbalance.  
Accuracy was **not the main objective** of this study.
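
To illustrate the imbalance effect: on a dataset where positives are rare, a trivial classifier that always predicts the majority class already reaches very high accuracy. The 2% positive rate below is a placeholder for illustration, not BAF's actual fraud rate.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = (rng.random(20_000) < 0.02).astype(int)  # ~2% positive class
y_pred = np.zeros_like(y_true)                    # always predict the negative class
accuracy = (y_pred == y_true).mean()              # ~0.98 despite learning nothing
```

This is why the BAF accuracy figure should not be read as evidence of a strong model.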

## 🎯 Intended Use

These models are intended for:

- Fairness analysis  
- Studying disparate impact and bias  
- Reproducing results from the EIF paper  
- Benchmarking fairness-aware methods  

## ⚠️ Limitations and Non-Intended Use

- Not designed for production  
- Not optimized for predictive performance  
- Should not be used for real-world decision-making  

These models intentionally expose biases in standard ML pipelines.

## βš–οΈ Ethical Considerations

This work highlights:
- The presence of bias in machine learning models  
- The limitations of fairness metrics  

Models should be interpreted as **analytical tools**, not fair systems.

## 📦 Repository Structure

Each dataset corresponds to a subfolder:

```
EIF-biased-classifier/
├── ASC_ADULT_model/
├── ASC_INC_model/
├── ASC_MOB_model/
├── ASC_EMP_model/
├── ASC_TRA_model/
├── ASC_PUC_model/
└── ASC_BAF_model/
```

Each folder contains:
- `config.json`
- `model.safetensors`

## 🚀 Usage

```python
# `Network` is the model class defined in the paper's codebase
# (https://github.com/ValentinLafargue/Inspection)
model = Network.from_pretrained(
    "ValentinLAFARGUE/EIF-biased-classifier",
    subfolder="ASC_INC_model"
)
```

## 📚 Citation

```
@misc{lafargue2026exposingillusionfairnessauditing,
      title={Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks}, 
      author={Valentin Lafargue and Adriana Laurindo Monteiro and Emmanuelle Claeys and Laurent Risser and Jean-Michel Loubes},
      year={2026},
      eprint={2507.20708},
      url={https://arxiv.org/abs/2507.20708}, 
}
```

## πŸ” Additional Notes

- Models are intentionally simple to isolate fairness behavior
- Results depend on preprocessing and sampling choices
- Focus is on reproducibility