RA-Monitor / README.md
nielsr's picture
nielsr HF Staff
Improve model card and metadata
1ce9e20 verified
|
raw
history blame
1.9 kB
metadata
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers

M3-AD: RA-Monitor

This repository contains the model weights for RA-Monitor, a unified reflection-aware multimodal framework for industrial anomaly detection. RA-Monitor is part of the M3-AD framework presented in the paper M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection.

Model Description

RA-Monitor addresses the issue where multimodal large language models (MLLMs) produce high-confidence but unreliable decisions in complex industrial scenarios. It introduces a reflection-aware mechanism that models reflection as a learnable decision revision process. This allows the model to perform controlled self-correction when initial judgments are unreliable, significantly improving anomaly type recognition and spatial localization.

The framework is built upon:

  • RA-Monitor: A mechanism that equips pre-trained models with thinking and reflective abilities.
  • M3-AD-FT: A dataset designed for reflection-aligned fine-tuning.
  • M3-AD-Bench: A benchmark for systematic cross-category evaluation of industrial anomaly detection.

Resources

Citation

If you find this work useful, please cite the following paper:

@article{m3ad2026,
  title={M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection},
  author={Li, Yanhui and others},
  journal={arXiv preprint arXiv:2603.00055},
  year={2026}
}