---
title: MIC Error Analysis
emoji: 🔍
colorFrom: red
colorTo: yellow
sdk: static
pinned: false
---
# MIC Error Analysis: 30 cases
Interactive viewer for 30 sampled errors of the MIC model (Ours-SFT-GRPO) on the TARABench test splits, grouped into three failure modes:
- **Mode A**: Perceptually subtle / locally-plausible edits (verdict miss)
- **Mode B**: Hallucinated visual grounding (verdict right, evidence fabricated)
- **Mode C**: Misidentified entity origin (right object, wrong country/era)
Open `index.html` for the interactive viewer.
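
If opening `index.html` straight from disk shows an empty viewer, serving the folder over local HTTP may help, since browsers often block pages opened via `file://` from loading neighbouring files. The sketch below is only an assumption about how the viewer loads its data, not part of this Space; `make_server` and port 8000 are hypothetical names and defaults chosen for illustration.

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Build an HTTP server that serves `directory` on localhost.

    `make_server` is a hypothetical helper for local preview; it is not
    shipped with this Space. Passing port=0 lets the OS pick a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


if __name__ == "__main__":
    # Run from the folder that contains index.html.
    with make_server() as httpd:
        port = httpd.server_address[1]
        print(f"Serving on http://127.0.0.1:{port}/index.html")
        httpd.serve_forever()
```

Run the script from the repository root, then open the printed URL in a browser. Stop the server with Ctrl+C.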