arxiv:2412.14816

TextSleuth: Towards Explainable Tampered Text Detection

Published on Jan 15, 2025

Authors:

Abstract

A large-scale dataset and explainable tampered text detection model are presented, utilizing natural language explanations and multimodal approaches to improve reliability and interpretability.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Recently, tampered text detection has attracted increasing attention due to its essential role in information security. Although existing methods can detect the tampered text region, the interpretation of such detection remains unclear, making the prediction unreliable. To address this problem, we propose to explain the basis of tampered text detection with natural language via large multimodal models. To fill the data gap for this task, we propose a large-scale, comprehensive dataset, ETTD, which contains both pixel-level annotations for tampered text region and natural language annotations describing the anomaly of the tampered text. Multiple methods are employed to improve the quality of the proposed data. For example, elaborate queries are introduced to generate high-quality anomaly descriptions with GPT4o. A fused mask prompt is proposed to reduce confusion when querying GPT4o to generate anomaly descriptions. To automatically filter out low-quality annotations, we also propose to prompt GPT4o to recognize tampered texts before describing the anomaly, and to filter out the responses with low OCR accuracy. To further improve explainable tampered text detection, we propose a simple yet effective model called TextSleuth, which achieves improved fine-grained perception and cross-domain generalization by focusing on the suspected region, with a two-stage analysis paradigm and an auxiliary grounding prompt. Extensive experiments on both the ETTD dataset and the public dataset have verified the effectiveness of the proposed methods. In-depth analysis is also provided to inspire further research. Our dataset and code will be open-source.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Get this paper in your agent:

hf papers read 2412.14816

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.14816 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.14816 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.14816 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.