---
license: mit
language:
- en
---
# 🪜 LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers

[![Project](https://img.shields.io/badge/Project-%23dfb317)](https://shantanu-ai.github.io/projects/ACL-2025-Ladder/index.html)
[![Paper](https://img.shields.io/badge/Paper-ACL%202025-%23dfb317)](https://aclanthology.org/2025.findings-acl.1177/)
[![Code](https://img.shields.io/badge/GitHub-batmanlab%2FLADDER-%2312100e)](https://github.com/batmanlab/Ladder)
[![Model](https://img.shields.io/badge/HuggingFace-Pretrained--Checkpoints-blue)](https://huggingface.co/shawn24/Ladder/tree/main)

---

## 📌 Summary

**LADDER** is a general framework that enables vision classifiers to automatically discover subpopulations (or "slices") of data where the model is underperforming, without requiring group annotations. It leverages **vision-language representations** and the **reasoning capabilities of large language models (LLMs)** to detect and rectify bias-inducing features in both natural and medical imaging domains.

---

## 🧠 Architecture & Components

- πŸ” **Slice Discovery** using:
  - CLIP, Mammo-CLIP, and CXR-CLIP features
  - BLIP and GPT-4o-generated captions
- 🧠 **Hypothesis Generation** using:
  - GPT-4o, Claude, Gemini, LLaMA
- ✅ **Bias Mitigation** via reweighting & pseudo-labeling
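
The reweighting idea in the mitigation step can be sketched as group-balanced loss weights: once the discovery step has assigned each training example a slice, small (error-prone) slices are upweighted so they contribute as much to training as large ones. This is an illustrative sketch, not the released implementation; the helper name `slice_balanced_weights` is hypothetical.

```python
from collections import Counter

def slice_balanced_weights(slice_ids):
    """Per-example loss weights inversely proportional to slice size.

    Illustrative sketch only (not the released LADDER code): an example
    in slice s gets weight n / (k * |s|), where n is the number of
    examples and k the number of slices, so the weights average to 1
    and every slice contributes equally to the total loss.
    """
    counts = Counter(slice_ids)
    n, k = len(slice_ids), len(counts)
    return [n / (k * counts[s]) for s in slice_ids]
```

Examples in a rare slice receive a proportionally larger weight, which is what makes the classifier attend to the underperforming subpopulation during retraining.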

---

## 📊 Datasets Used

- **Natural Images**: Waterbirds, CelebA, MetaShift
- **Medical Images**: NIH ChestX-ray, RSNA Mammograms, VinDr Mammograms

---

## 📦 Files Included

| File | Description |
|------|-------------|
| `model.pt` | Pretrained model checkpoint |
| `feature_cache.pkl` | Cached representations (CLIP/Mammo-CLIP/CXR-CLIP) |
| `metadata.csv` | Metadata with discovered slice labels |
| `caption_blip.json` | BLIP-generated captions |
| `caption_gpt4o.json` | GPT-4o-generated captions |
| `predictions.json` | Model predictions on test set |

---


## 🧪 Benchmarks

LADDER outperforms prior slice-discovery methods such as Domino and FACTS across six datasets and more than 200 classifiers. It is especially effective at:

- Discovering hidden biases without explicit attribute labels
- Reasoning about non-visual factors (e.g., preprocessing artifacts)
- Operating without human-written captions

---

## 📜 Citation

```bibtex
@article{ghosh2024ladder,
  title={LADDER: Language Driven Slice Discovery and Error Rectification},
  author={Ghosh, Shantanu and Syed, Rayan and Wang, Chenyu and Poynton, Clare B and Visweswaran, Shyam and Batmanghelich, Kayhan},
  journal={arXiv preprint arXiv:2408.07832},
  year={2024}
}
```

---

## 🤝 Acknowledgements

Boston University, Stanford University, BUMC, and the University of Pittsburgh.