title: README
emoji: 😻
colorFrom: blue
colorTo: purple
sdk: static
pinned: false

About

SynthFairCLIP is a research initiative focused on fair vision–language models.

We study how to reduce bias in CLIP-style models by combining:

  • Real data from large-scale datasets such as DataComp/CommonPool.
  • Synthetic data generated with state-of-the-art diffusion models.
  • Curation and balancing of demographic attributes across professions, activities and contexts.

What we release

  • CLIP models trained on hybrid real–synthetic data.
  • Large-scale WebDataset shards of synthetic / hybrid image–text data.
  • Evaluation tools and benchmarks for analysing bias and fairness in CLIP-like models.
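
For readers new to the WebDataset format mentioned above: a shard is an ordinary tar archive in which files sharing a basename form one sample, and the file extension names the field. The sketch below illustrates that convention using only the Python standard library. The key names, the `jpg`/`txt` field extensions, and the captions are hypothetical placeholders, not the actual layout of the released shards, and `build_demo_shard` exists only to make the example self-contained.

```python
import io
import tarfile


def build_demo_shard():
    """Build a tiny in-memory tar that mimics a WebDataset shard (hypothetical contents)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        demo = [("000000", "a photo of a doctor"), ("000001", "a photo of a teacher")]
        for key, caption in demo:
            fields = [("jpg", b"<image bytes>"), ("txt", caption.encode("utf-8"))]
            for ext, payload in fields:
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def iter_samples(fileobj):
    """Yield (key, {extension: bytes}) by grouping consecutive members with the same basename."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        current_key, sample = None, {}
        for member in tar:
            if not member.isfile():
                continue
            # Split "dir/000000.jpg" into key "dir/000000" and extension "jpg".
            directory, _, base = member.name.rpartition("/")
            stem, _, ext = base.partition(".")
            key = f"{directory}/{stem}" if directory else stem
            if current_key is not None and key != current_key:
                yield current_key, sample
                sample = {}
            current_key = key
            sample[ext] = tar.extractfile(member).read()
        if current_key is not None:
            yield current_key, sample


if __name__ == "__main__":
    for key, fields in iter_samples(build_demo_shard()):
        print(key, sorted(fields), fields["txt"].decode("utf-8"))
```

To read a shard from disk, open it in binary mode and pass the file object to `iter_samples`. For training at scale, a dedicated loader such as the `webdataset` library is the more practical choice; the sketch only shows how samples are laid out inside a shard.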

GitHub – evaluation tools

If you use our resources, please consider citing the SynthFairCLIP project.


Acknowledgement

We acknowledge the EuroHPC JU for awarding project EHPC-AI-2024A02-040 access to MareNostrum 5, hosted at BSC-CNS.