elly99 committed on
Commit
a7b01bb
·
verified ·
1 Parent(s): a1d2747

Create failure_analysis

Files changed (1)
  1. benchmark/failure_analysis +56 -0
benchmark/failure_analysis ADDED
@@ -0,0 +1,56 @@
+ ### Case 1: Source Ambiguity
+
+ **Domain:** Medicine
+ **Task:** Explain the structure and functions of the integumentary system.
+
+ **Claim generated by the model:**
+ "The integration of dermatology with psychology and psychiatry represents a growing field that could lead to more holistic treatment approaches."
+
+ **Verification result:**
+ EPISTEMIC FAILURE
+
+ **Reason:**
+ The retrieved sources discuss psychological aspects of skin diseases, but they do not explicitly describe the integration of dermatology, psychology, and psychiatry as a formal interdisciplinary field.
+ The model inferred a structured integration that is not directly present in the sources.
+
+ **Failure Category:**
+ Source Ambiguity
+
+ ---
+
+ ### Case 2: Source Ambiguity
+
+ **Domain:** Law
+ **Task:** Information Society: description of an evolving legal landscape.
+
+ **Claim generated by the model:**
+ "The information society represents a fundamental concept for understanding contemporary legal dynamics."
+
+ **Verification result:**
+ EPISTEMIC FAILURE
+
+ **Reason:**
+ The provided document describes the evolution of legal informatics and the expansion of topics related to digital technologies, but it **does not explicitly state** that the "information society" is a fundamental concept for understanding contemporary legal dynamics.
+ The model produced a plausible generalization that is **not supported** by any source in the corpus.
+
+ **Failure Category:**
+ Source Ambiguity
+
+ ---
+
+ ### Case 3: Unauthorized Inference
+
+ **Domain:** Linguistics
+ **Task:** Explain the cognitive substrate of Specific Language Needs (Bisogni Linguistici Specifici).
+
+ **Claim generated by the model:**
+ "The use of teaching strategies focused on the mental representation of language may be more effective than traditional methods."
+
+ **Verification result:**
+ EPISTEMIC FAILURE
+
+ **Reason:**
+ The analyzed document discusses the glottodidactic potential of Cognitive Linguistics and mentions instructional applications that may support learners with Specific Language Needs. However, it **does not provide empirical evidence** or experimental studies demonstrating that mental-representation-based strategies are more effective than traditional methods.
+ The model converted a *theoretical proposal* into a *claim of proven effectiveness*, which is **not supported** by the corpus.
+
+ **Failure Category:**
+ Unauthorized Inference
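
The three cases above share a fixed set of fields (domain, verification result, failure category), so they could also be kept in a machine-readable form and tallied. The following is only a minimal sketch, not part of the committed file: the `FailureCase` class, its field names, and `tally_by_category` are hypothetical names introduced here for illustration, and the records simply restate the values listed in the cases.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureCase:
    """Hypothetical record type mirroring the fields used in each case."""
    case_id: int
    domain: str
    verification_result: str
    failure_category: str


# Values copied from the three cases documented above.
CASES = [
    FailureCase(1, "Medicine", "EPISTEMIC FAILURE", "Source Ambiguity"),
    FailureCase(2, "Law", "EPISTEMIC FAILURE", "Source Ambiguity"),
    FailureCase(3, "Linguistics", "EPISTEMIC FAILURE", "Unauthorized Inference"),
]


def tally_by_category(cases):
    """Count how many cases fall into each failure category."""
    return Counter(case.failure_category for case in cases)


if __name__ == "__main__":
    for category, count in tally_by_category(CASES).items():
        print(f"{category}: {count}")
```

Keeping the category as a plain string matches the free-text labels in the file; an enum would be a reasonable alternative if the set of categories were fixed.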