sdk: static
pinned: false
---

## About

**SynthFairCLIP** is a research initiative focused on fair vision–language models.

We study how to reduce bias in CLIP-style models by combining:

- **Real data** from large-scale datasets such as DataComp/CommonPool.
- **Synthetic data** generated with state-of-the-art diffusion models.
- **Curation and balancing** of demographic attributes across professions, activities, and contexts.

---

### What we release

- **CLIP models** trained on hybrid real–synthetic data.
- **Large-scale WebDataset shards** of synthetic and hybrid image–text data.
- **Evaluation tools and benchmarks** for analysing bias and fairness in CLIP-like models.

If you use our resources, please consider citing the SynthFairCLIP project.

---

### Acknowledgement

We acknowledge EuroHPC JU for awarding project EHPC-AI-2024A02-040 access to MareNostrum 5, hosted at BSC-CNS.