| title: MIC Error Analysis | |
| emoji: π | |
| colorFrom: red | |
| colorTo: yellow | |
| sdk: static | |
| pinned: false | |
| # MIC Error Analysis: 30 cases | |
| Interactive viewer for 30 sampled errors of the MIC model (Ours-SFT-GRPO) on the TARABench test splits, grouped into three failure modes: | |
| - **Mode A**: Perceptually subtle / locally-plausible edits (verdict miss) | |
| - **Mode B**: Hallucinated visual grounding (verdict right, evidence fabricated) | |
| - **Mode C**: Misidentified entity origin (right object, wrong country/era) | |
| Open `index.html` for the interactive viewer. | |
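
For anyone who wants to tally cases outside the viewer, the three failure modes listed above could be represented as a small enum. This is only a sketch: the record layout (`id`, `mode`) and the names `FailureMode`, `count_by_mode`, and `example_cases` are hypothetical and are not taken from the viewer's actual data files.

```python
from collections import Counter
from enum import Enum


class FailureMode(Enum):
    # Descriptions are copied from the mode list in this README.
    A = "Perceptually subtle / locally-plausible edits (verdict miss)"
    B = "Hallucinated visual grounding (verdict right, evidence fabricated)"
    C = "Misidentified entity origin (right object, wrong country/era)"


def count_by_mode(cases):
    """Tally case records by their 'mode' field ('A', 'B', or 'C')."""
    return Counter(FailureMode[case["mode"]] for case in cases)


# Hypothetical records; the real viewer's data schema is an assumption here.
example_cases = [
    {"id": 1, "mode": "A"},
    {"id": 2, "mode": "B"},
    {"id": 3, "mode": "A"},
]

counts = count_by_mode(example_cases)
for mode in FailureMode:
    # Counter returns 0 for modes with no cases.
    print(f"Mode {mode.name}: {counts[mode]} case(s)")
```

Keying the counter on the enum member rather than the raw letter means a record with an unknown mode raises a `KeyError` instead of being silently counted.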