license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
M3-AD: RA-Monitor
This repository contains the model weights for RA-Monitor, a unified reflection-aware multimodal framework for industrial anomaly detection. RA-Monitor is part of the M3-AD framework presented in the paper M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection.
Model Description
RA-Monitor addresses the issue where multimodal large language models (MLLMs) produce high-confidence but unreliable decisions in complex industrial scenarios. It introduces a reflection-aware mechanism that models reflection as a learnable decision revision process. This allows the model to perform controlled self-correction when initial judgments are unreliable, significantly improving anomaly type recognition and spatial localization.
The framework is built upon:
- RA-Monitor: A mechanism that equips pre-trained models with thinking and reflective abilities.
- M3-AD-FT: A dataset designed for reflection-aligned fine-tuning.
- M3-AD-Bench: A benchmark for systematic cross-category evaluation of industrial anomaly detection.
Resources
- Paper: M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection
- GitHub Repository: Yanhui-Lee/M3-AD
Citation
If you find this work useful, please cite the following paper:
@article{m3ad2026,
title={M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection},
author={Li, Yanhui and others},
journal={arXiv preprint arXiv:2603.00055},
year={2026}
}