Create failure_analysis
benchmark/failure_analysis
ADDED
@@ -0,0 +1,56 @@
### Case 1: Source Ambiguity

**Domain:** Medicine
**Task:** Explain the structure and functions of the integumentary system.

**Claim generated by the model:**
"The integration of dermatology with psychology and psychiatry represents a growing field that could lead to more holistic treatment approaches."

**Verification result:**
EPISTEMIC FAILURE

**Reason:**
The retrieved sources discuss psychological aspects of skin diseases but do not describe the integration of dermatology, psychology, and psychiatry as a formal interdisciplinary field.
The model inferred a structured integration that is not directly present in the sources.

**Failure Category:**
Source Ambiguity

---

### Case 2: Source Ambiguity

**Domain:** Law
**Task:** Information Society: description of an evolving legal landscape.

**Claim generated by the model:**
"The information society represents a fundamental concept for understanding contemporary legal dynamics."

**Verification result:**
EPISTEMIC FAILURE

**Reason:**
The provided document describes the evolution of legal informatics and the expansion of topics related to digital technologies, but it **does not explicitly state** that the “information society” is a fundamental concept for understanding contemporary legal dynamics.
The model produced a plausible generalization that is **not supported** by any source in the corpus.

**Failure Category:**
Source Ambiguity

---

### Case 3: Unauthorized Inference

**Domain:** Linguistics
**Task:** Explain the cognitive substrate of Specific Language Needs (Bisogni Linguistici Specifici).

**Claim generated by the model:**
"The use of teaching strategies focused on the mental representation of language may be more effective than traditional methods."

**Verification result:**
EPISTEMIC FAILURE

**Reason:**
The analyzed document discusses the glottodidactic potential of Cognitive Linguistics and mentions instructional applications that may support learners with Specific Language Needs. However, it **does not provide empirical evidence** or experimental studies demonstrating that mental-representation-based strategies are more effective than traditional methods.
The model converted a *theoretical proposal* into a *claim of proven effectiveness*, which is **not supported** by the corpus.

**Failure Category:**
Unauthorized Inference
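
The three cases above share the same fields, so they could be recorded in a structured form and tallied by failure category. The following is a minimal sketch under stated assumptions: the `FailureCase` class, its field names, and `tally_by_category` are hypothetical names chosen for illustration, not part of the benchmark's actual code or schema. Only the domain, verdict, and category values come from the cases documented here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureCase:
    """One documented case. Field names are illustrative, not the project's schema."""
    case_id: int
    domain: str
    verdict: str
    category: str


# Values copied from the three cases in this file.
CASES = [
    FailureCase(1, "Medicine", "EPISTEMIC FAILURE", "Source Ambiguity"),
    FailureCase(2, "Law", "EPISTEMIC FAILURE", "Source Ambiguity"),
    FailureCase(3, "Linguistics", "EPISTEMIC FAILURE", "Unauthorized Inference"),
]


def tally_by_category(cases):
    """Count how many cases fall into each failure category."""
    return Counter(case.category for case in cases)


category_counts = tally_by_category(CASES)
for category, count in category_counts.most_common():
    print(f"{category}: {count}")
```

Keeping the category as a plain string mirrors how it appears in the write-ups. A fixed enumeration would be a reasonable alternative if the set of categories is closed, which this file does not state.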