---
title: README
emoji: 😻
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
---
## About

**SynthFairCLIP** is a research initiative focused on fair vision–language models.

We study how to reduce bias in CLIP-style models by combining:

- **Real data** from large-scale datasets such as DataComp/CommonPool.
- **Synthetic data** generated with state-of-the-art diffusion models.
- **Curation and balancing** of demographic attributes across professions, activities and contexts.

---

### What we release

- **CLIP models** trained on hybrid real–synthetic data.
- **Large-scale WebDataset shards** of synthetic / hybrid image–text data.
- **Eval tools and benchmarks** for analysing bias and fairness in CLIP-like models.
[Evaluation code](https://github.com/lluisgomez/SynthFairCLIP/tree/main/evals)
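
As a quick orientation, here is a minimal sketch of how WebDataset image–text shards are typically streamed with the `webdataset` library; the shard path pattern and the per-sample key names (`jpg`, `txt`) are placeholders for illustration, not a description of the actual released files.

```python
# Minimal sketch (assumptions, not the project's exact loader): stream
# image–text pairs from WebDataset tar shards with the `webdataset` library.
# The shard path pattern below is a placeholder, not a real released file name.
import webdataset as wds

shard_pattern = "shards/hybrid-{000000..000099}.tar"  # hypothetical shard names

dataset = (
    wds.WebDataset(shard_pattern)
    .decode("pil")              # decode image bytes into PIL.Image objects
    .to_tuple("jpg", "txt")     # yield (image, caption) pairs per sample
)

for image, caption in dataset:
    print(image.size, caption[:80])
    break
```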
If you use our resources, please consider citing the SynthFairCLIP project.

---

### Acknowledgement

We acknowledge EuroHPC JU for awarding the project ID EHPC-AI-2024A02-040 access to MareNostrum 5 hosted at BSC-CNS.