+# 🪜 LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers
+[![Project](https://img.shields.io/badge/Project-%23dfb317)](https://shantanu-ai.github.io/projects/ACL-2025-Ladder/index.html)
+[![Paper](https://img.shields.io/badge/Paper-ACL%202025-%23dfb317)](https://arxiv.org/abs/2408.07832)
+[![Code](https://img.shields.io/badge/GitHub-batmanlab%2FLADDER-%2312100e)](https://github.com/batmanlab/Ladder)
+[![Model](https://img.shields.io/badge/HuggingFace-Pretrained--Checkpoints-blue)](https://huggingface.co/your-model-name)
+[![License](https://img.shields.io/badge/License-MIT-green)](./LICENSE)
+---
+## 📌 Summary
+**LADDER** is a general framework for automatically discovering subpopulations (or "slices") of data on which a vision classifier underperforms, without requiring group annotations. It leverages **vision-language representations** and the **reasoning capabilities of large language models (LLMs)** to detect and rectify bias-inducing features in both natural and medical imaging domains.
+---
+## 🧠 Architecture & Components
+- 🔍 **Slice Discovery** using:
+  - CLIP, Mammo-CLIP, and CXR-CLIP features
+  - BLIP and GPT-4o-generated captions
+- 🧠 **Hypothesis Generation** using:
+  - GPT-4o, Claude, Gemini, LLaMA
+- ✅ **Bias Mitigation** via reweighting & pseudo-labeling
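As a rough illustration of the reweighting idea named in the list above, the sketch below computes inverse-frequency sample weights from slice labels, so that every slice contributes the same total weight during training. This is one common reweighting scheme and is not confirmed to be the one LADDER uses; the function name `inverse_frequency_weights` and the input format (one slice label per sample) are assumptions made for this example.

```python
from collections import Counter


def inverse_frequency_weights(slice_labels):
    """Return one weight per sample, proportional to 1 / (size of its slice).

    Hypothetical helper: weights are normalised so they average to 1
    over the dataset, and every slice receives the same total weight.
    """
    counts = Counter(slice_labels)
    n_samples = len(slice_labels)
    n_slices = len(counts)
    # Each slice's weights sum to n_samples / n_slices.
    return [n_samples / (n_slices * counts[label]) for label in slice_labels]
```

The resulting list could be passed to a weighted sampler or used as per-sample loss weights; which of these fits best depends on the training setup.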
+---
+## 📊 Datasets Used
+- **Natural Images**: Waterbirds, CelebA, MetaShift
+- **Medical Images**: NIH ChestX-ray, RSNA Mammograms, VinDr Mammograms
+---
+## 📦 Files Included
+| File | Description |
+|------|-------------|
+| `model.pt` | Pretrained model checkpoint |
+| `feature_cache.pkl` | Cached representations (CLIP/Mammo-CLIP/CXR-CLIP) |
+| `metadata.csv` | Metadata with discovered slice labels |
+| `caption_blip.json` | BLIP-generated captions |
+| `caption_gpt4o.json` | GPT-4o-generated captions |
+| `predictions.json` | Model predictions on test set |
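A minimal sketch of loading the text-based files from the table above, using only the Python standard library. The file names come from the table; the function name `load_ladder_artifacts`, the directory argument, and the assumptions that `metadata.csv` has a header row and that each JSON file holds a single JSON document are illustrative and not confirmed by this repository. `model.pt` and `feature_cache.pkl` are left out because their internal layout is not documented here.

```python
import csv
import json
from pathlib import Path


def load_ladder_artifacts(root):
    """Load the CSV/JSON files listed in the table from `root` (hypothetical helper).

    Missing files are skipped, so the result only contains keys
    for files that actually exist in the directory.
    """
    root = Path(root)
    artifacts = {}

    # metadata.csv: assumed to have a header row; column names are not documented.
    metadata_path = root / "metadata.csv"
    if metadata_path.exists():
        with metadata_path.open(newline="", encoding="utf-8") as f:
            artifacts["metadata"] = list(csv.DictReader(f))

    # JSON files: each is assumed to hold one JSON document of unknown schema.
    for name in ("caption_blip.json", "caption_gpt4o.json", "predictions.json"):
        path = root / name
        if path.exists():
            with path.open(encoding="utf-8") as f:
                artifacts[path.stem] = json.load(f)

    return artifacts
```

Calling `load_ladder_artifacts("path/to/download")` returns a dictionary keyed by `metadata`, `caption_blip`, `caption_gpt4o`, and `predictions` for whichever files are present; inspecting the first few entries is a reasonable way to learn the actual schema.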
+---
+## 🧪 Benchmarks
+LADDER outperforms traditional slice discovery methods (Domino, FACTS) across 6 datasets and more than 200 classifiers. It is especially effective at:
+- Discovering hidden biases without explicit attribute labels
+- Reasoning about non-visual factors (e.g., preprocessing artifacts)
+- Operating without human-written captions
+---
+## 📜 Citation
+```bibtex
+@article{ghosh2024ladder,
+  title={LADDER: Language Driven Slice Discovery and Error Rectification},
+  author={Ghosh, Shantanu and Syed, Rayan and Wang, Chenyu and Poynton, Clare B and Visweswaran, Shyam and Batmanghelich, Kayhan},
+  journal={arXiv preprint arXiv:2408.07832},
+  year={2024}
+}
+```
+---
+## 🤝 Acknowledgements
+Boston University, Stanford University, BUMC, and the University of Pittsburgh.