---
tags:
- text faithfulness
- hallucination detection
- RAG evaluation
- cognitive statements
- factual consistency
datasets:
- future7/CogniBench
- future7/CogniBench-L
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B
---

# CogniDet: Cognitive Faithfulness Detector for LLMs

**CogniDet** is a state-of-the-art model for detecting **both factual and cognitive hallucinations** in Large Language Model (LLM) outputs. Developed as part of the [CogniBench](https://github.com/FUTUREEEEEE/CogniBench) framework, it specifically addresses the challenge of evaluating inference-based statements beyond simple fact regurgitation.

## Key Features ✨
1. **Dual Detection Capability**
   Identifies both:
   - **Factual Hallucinations** (claims contradicting provided context)
   - **Cognitive Hallucinations** (unsupported inferences/evaluations)

2. **Legal-Inspired Rigor**
   Incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards

3. **Efficient Inference**
   Single-pass detection with **8B parameter Llama3 backbone** (faster than NLI-based methods)

4. **Large-Scale Training**
   Trained on **CogniBench-L** (24k+ dialogues, 234k+ annotated sentences)

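The three tiers above are increasingly strict support levels. As a hypothetical illustration of how they might be encoded (label names taken from the list above; the paper defines the exact evidentiary criteria), an ordered enum makes threshold checks simple comparisons:

```python
from enum import IntEnum

class SupportTier(IntEnum):
    # Hypothetical encoding of the tiered framework, ordered from the
    # loosest to the strictest evidence standard.
    RATIONAL = 1     # plausible inference, weakly tied to the context
    GROUNDED = 2     # inference directly supported by context evidence
    UNEQUIVOCAL = 3  # explicitly entailed by the context

# Stricter tiers compare as greater, so a minimum-support policy is a comparison.
meets_bar = SupportTier.UNEQUIVOCAL >= SupportTier.GROUNDED
```

This is only a sketch of the ordering; consult the CogniBench paper for how each tier is actually adjudicated.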
## Performance 🚀
| Detection Type | F1 Score |
|----------------------|----------|
| **Overall** | 70.30 |
| Factual Hallucination | 64.40 |
| **Cognitive Hallucination** | **73.80** |

*Outperforms baselines like SelfCheckGPT (61.1 F1 on cognitive) and RAGTruth (45.3 F1 on factual)*

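For reference, the F1 scores above combine precision and recall in the standard way. A minimal helper computing F1 from true-positive, false-positive, and false-negative counts:

```python
def f1(tp, fp, fn):
    # Standard F1: harmonic mean of precision and recall.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# e.g. 50 true positives, 30 false positives, 20 false negatives
score = f1(50, 30, 20)  # 100/150 ≈ 0.667
```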
## Usage 💻
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context, response):
    inputs = tokenizer(
        f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:",
        return_tensors="pt"
    )
    outputs = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens, skipping the echoed prompt
    gen_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(gen_tokens, skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."

print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
```

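Since CogniDet is trained on sentence-level annotations, long responses can also be scored sentence by sentence. A naive regex-based splitter (a stand-in; a proper sentence tokenizer such as nltk's is preferable for production use) might look like:

```python
import re

def split_sentences(text):
    # Naive split on ., !, or ? followed by whitespace; swap in a real
    # sentence tokenizer for text with abbreviations or decimals.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

split_sentences("Moringa grows in warm zones. It can bloom twice yearly indoors.")
# → ['Moringa grows in warm zones.', 'It can bloom twice yearly indoors.']
```

Each sentence can then be passed to `detect_hallucinations` together with the shared context.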
## Training Data 🔬
Trained on **CogniBench-L** featuring:
- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (Medical, Legal, etc.)
- Auto-labeled via rigorous pipeline (82.2% agreement with humans)

## Limitations ⚠️
1. Best performance on **English** knowledge-grounded dialogues
2. Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
3. Context window limited to 8K tokens

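Given the 8K-token context window (limitation 3), overly long retrieved contexts should be truncated before scoring. A minimal sketch, using a whitespace tokenizer as a stand-in for the model's real tokenizer:

```python
def truncate_to_budget(text, max_tokens, tokenize=str.split, join=" ".join):
    # Keep only the first max_tokens tokens. With the real model, use
    # tokenizer.encode(...)[:max_tokens] and tokenizer.decode(...) instead.
    tokens = tokenize(text)
    return join(tokens[:max_tokens])

truncate_to_budget("one two three four five", 3)  # → 'one two three'
```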
## Citation 📚
If you use CogniDet, please cite the CogniBench paper:
```bibtex
@inproceedings{tang2025cognibench,
  title         = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author        = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle     = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year          = {2025},
  pages         = {xxx--xxx}, % add page range
  publisher     = {Association for Computational Linguistics},
  location      = {Vienna, Austria},
  url           = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint        = {2505.20767},
  primaryClass  = {cs.CL}
}
```

## Resources 🔗
- [CogniBench GitHub](https://github.com/FUTUREEEEEE/CogniBench)