---
base_model:
- meta-llama/Meta-Llama-3-8B
datasets:
- future7/CogniBench
- future7/CogniBench-L
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text faithfulness
- hallucination detection
- RAG evaluation
- cognitive statements
- factual consistency
---

# CogniDet: Cognitive Faithfulness Detector for LLMs

**CogniDet** is a state-of-the-art model for detecting **both factual and cognitive hallucinations** in Large Language Model (LLM) outputs. Developed as part of the [CogniBench](https://github.com/FUTUREEEEEE/CogniBench) framework, it specifically addresses the challenge of evaluating inference-based statements that go beyond simple fact regurgitation. The model is presented in the paper [CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models](https://huggingface.co/papers/2505.20767).

## Key Features ✨

1. **Dual Detection Capability**
   Identifies both:
   - **Factual Hallucinations** (claims contradicting the provided context)
   - **Cognitive Hallucinations** (unsupported inferences or evaluations)

2. **Legal-Inspired Rigor**
   Incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards (see the sketch after this list)

3. **Efficient Inference**
   Single-pass detection with an **8B-parameter Llama 3 backbone**, faster than NLI-based methods

4. **Large-Scale Training**
   Trained on **CogniBench-L** (24k+ dialogues, 234k+ annotated sentences)
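
The three tiers form an ordered scale of evidential strictness. A minimal sketch of how that ordering could be represented in code (the `CognitiveTier` name, numeric values, and one-line glosses are illustrative assumptions, not part of the released model or dataset API):

```python
from enum import IntEnum

class CognitiveTier(IntEnum):
    """Ordered strictness tiers from the legal-inspired framework (illustrative)."""
    RATIONAL = 1     # loosest standard of support
    GROUNDED = 2     # stricter: anchored in the provided context
    UNEQUIVOCAL = 3  # strictest: leaves no reasonable doubt

# A statement that meets a stricter tier also meets every looser one.
assert CognitiveTier.UNEQUIVOCAL > CognitiveTier.GROUNDED > CognitiveTier.RATIONAL
```
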
## Performance 📊

| Detection Type              | F1 Score  |
|-----------------------------|-----------|
| **Overall**                 | 70.30     |
| Factual Hallucination       | 64.40     |
| **Cognitive Hallucination** | **73.80** |

*Outperforms baselines such as SelfCheckGPT (61.1 F1 on cognitive) and RAGTruth (45.3 F1 on factual).*
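
The scores above are F1 values over binary hallucination labels. For reference, the same metric can be computed with standard tooling; a minimal sketch using scikit-learn (the toy label arrays are placeholders, not CogniBench data):

```python
from sklearn.metrics import f1_score

# Toy sentence-level labels: 1 = hallucinated, 0 = faithful (placeholders).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

# F1 is the harmonic mean of precision and recall on the positive class.
print(f"F1 = {f1_score(gold, pred):.2f}")
```
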
## Usage 💻

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context, response):
    # Prompt scaffold: the context and response, followed by a cue for the
    # model to describe any hallucinated content.
    prompt = f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."

print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
```
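
Because CogniBench annotates faithfulness at the sentence level, it can be useful to score each response sentence separately. A minimal sketch building on the `detect_hallucinations` helper above (the naive regex splitting and the dict-shaped result are assumptions, not part of the model's interface):

```python
import re

def detect_per_sentence(context, response):
    """Run the detector once per response sentence (naive punctuation split)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    return {s: detect_hallucinations(context, s) for s in sentences}

for sentence, verdict in detect_per_sentence(context, response).items():
    print(f"- {sentence}\n  -> {verdict}")
```
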
## Training Data 🔬

Trained on **CogniBench-L**, which features:

- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (Medical, Legal, etc.)
- Auto-labeling via a rigorous pipeline (82.2% agreement with human annotators)
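
The datasets listed in this card's metadata are hosted on the Hugging Face Hub, so the annotations can be inspected directly; a minimal sketch (the `train` split name is an assumption — check the dataset card for the actual configuration and schema):

```python
from datasets import load_dataset

# Dataset ids come from this card's metadata; the split name is an assumption.
ds = load_dataset("future7/CogniBench-L", split="train")
print(ds)     # features and row count
print(ds[0])  # one annotated example
```
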
## Limitations ⚠️

1. Best performance on **English** knowledge-grounded dialogues
2. Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
3. Context window limited to 8K tokens (see the truncation sketch below)
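
To stay within the 8K-token window from limitation 3, overlong contexts can be clipped before building the prompt. A minimal sketch reusing the `tokenizer` from the usage snippet above (the exact 8192 limit and the reserved margin are assumptions):

```python
MAX_CONTEXT_TOKENS = 8192  # assumed window size from the limitation above
RESERVED_TOKENS = 256      # assumed headroom for the prompt scaffold and generation

def truncate_context(context, tokenizer,
                     max_tokens=MAX_CONTEXT_TOKENS - RESERVED_TOKENS):
    """Keep only as many leading context tokens as the window allows."""
    ids = tokenizer(context, add_special_tokens=False)["input_ids"]
    if len(ids) <= max_tokens:
        return context
    return tokenizer.decode(ids[:max_tokens])
```
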
## Citation 📜

If you use CogniDet, please cite the CogniBench paper:

```bibtex
@inproceedings{tang2025cognibench,
  title         = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author        = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle     = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year          = {2025},
  pages         = {xxx--xxx}, % add the page range when available
  publisher     = {Association for Computational Linguistics},
  location      = {Vienna, Austria},
  url           = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint        = {2505.20767},
  primaryClass  = {cs.CL}
}
```

## Resources 🔗

- [CogniBench GitHub](https://github.com/FUTUREEEEEE/CogniBench)