ModernBERT Tool-Calling Hallucination Detector

This model is a task-specific ModernBERT token classifier trained on a ToolACE-derived synthetic hallucination dataset.

Dataset: https://huggingface.co/datasets/marrita/toolace-tool-calling-hallucination-ragtruth

The model receives a user query, a tool output, and a final answer. The training loss is applied only to answer tokens: tokens overlapping hallucinated spans are labeled positive, all other answer tokens are labeled negative, and query and tool-output (context) tokens are masked out of the loss.
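The labeling scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the training code: the function name `label_tokens`, the whitespace tokenization, and the use of -100 as the masked label (the default ignore index of PyTorch's cross-entropy loss) are all assumptions made for the example.

```python
import re

IGNORE_INDEX = -100  # assumption: conventional "ignore" label for masked tokens


def label_tokens(offsets, answer_start, answer_end, hallucinated_spans):
    """Assign one label per token.

    offsets: (start, end) character offsets of each token in the full input.
    hallucinated_spans: (start, end) character spans, end exclusive,
        in the same coordinates as the offsets.
    """
    labels = []
    for tok_start, tok_end in offsets:
        outside_answer = tok_start < answer_start or tok_end > answer_end
        if tok_start == tok_end or outside_answer:
            # Query/context tokens (and empty tokens) do not contribute to the loss.
            labels.append(IGNORE_INDEX)
            continue
        overlaps = any(
            tok_start < span_end and span_start < tok_end
            for span_start, span_end in hallucinated_spans
        )
        labels.append(1 if overlaps else 0)
    return labels


# Toy input (hypothetical): whitespace "tokens" stand in for a real tokenizer.
query = "What is the weather?"
tool_output = "temp=20C"
answer = "It is 25C today"
text = f"{query} {tool_output} {answer}"

answer_start = len(query) + 1 + len(tool_output) + 1
answer_end = len(text)
span_start = answer_start + answer.index("25C")
spans = [(span_start, span_start + len("25C"))]

offsets = [m.span() for m in re.finditer(r"\S+", text)]
labels = label_tokens(offsets, answer_start, answer_end, spans)
print(labels)  # [-100, -100, -100, -100, -100, 0, 0, 1, 0]
```

With a real subword tokenizer, the offsets would come from the tokenizer's offset mapping instead of the regular expression used here.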

Intended use

The model is intended for the course assignment "Hallucination Detection in Tool Calling" and for controlled evaluation on the accompanying synthetic dataset.

Evaluation

The main evaluation metric is sentence-level macro-F1. Balanced accuracy, ROC-AUC, PR-AUC, and span-level character F1 are also reported. Detailed results are stored in evaluation_summary.json.
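For readers unfamiliar with these metrics, the sketch below shows one common way to compute binary macro-F1 and a character-level span F1 in plain Python. The exact definitions used to produce evaluation_summary.json are not stated here, so the function names, the "any positive token flags the sentence" aggregation rule, and the handling of empty spans are assumptions made for illustration.

```python
def sentence_flag(token_preds):
    """Assumed aggregation: a sentence is positive if any of its tokens is."""
    return int(any(p == 1 for p in token_preds))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the two classes 0 and 1."""
    scores = []
    for cls in (0, 1):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def char_f1(gold_spans, pred_spans):
    """F1 over character positions covered by (start, end) spans, end exclusive."""
    gold = {i for start, end in gold_spans for i in range(start, end)}
    pred = {i for start, end in pred_spans for i in range(start, end)}
    if not gold and not pred:
        return 1.0  # assumption: no gold and no predicted spans counts as a match
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): four sentences, one hallucinated sentence missed.
sentence_true = [1, 0, 1, 0]
sentence_pred = [sentence_flag(t) for t in ([0, 1, 0], [0, 0], [0, 0, 0], [0])]
print(round(macro_f1(sentence_true, sentence_pred), 4))  # 0.7333
print(char_f1([(10, 20)], [(15, 25)]))                   # 0.5
```

Macro-F1 averages the two classes equally, so it is less dominated by the majority (non-hallucinated) class than plain accuracy.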

Limitations

The model is trained on automatically corrupted examples. It should be interpreted as a detector for this controlled benchmark rather than a general-purpose hallucination detector for all real-world tool-calling systems.

Model size: 0.1B params
Tensor type: F32 (Safetensors)