GYASBGFUHAADSGADF committed on
Commit aebc6c1 · verified
1 Parent(s): 7701922

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +22 -2
README.md CHANGED
@@ -1,3 +1,14 @@
1
  # MathTrap300
2
 
3
  A benchmark dataset of 300 insolvable, ill-posed mathematical problems designed to evaluate large language models' ability to recognize mathematical insolvability and fundamental contradictions.
@@ -23,7 +34,7 @@ from datasets import load_dataset
23
  dataset = load_dataset("GYASBGFUHAADSGADF/mathtrap300")
24
 
25
  # Access the data
26
- for example in dataset['batch1']:
27
  print(f"Original: {example['original']}")
28
  print(f"Trap: {example['trap']}")
29
  print(f"Annotation: {example['annotation']}")
@@ -50,6 +61,15 @@ Our evaluation of recent advanced LLMs on MathTrap300 reveals:
50
  - Condition Neglect: Models ignore critical mathematical constraints
51
  - **Forced Solutions**: Even when models recognize insolvability, they still attempt to force a solution
52
53
  ## Citation
54
 
55
  If you use this dataset in your research, please cite our paper:
@@ -66,4 +86,4 @@ If you use this dataset in your research, please cite our paper:
66
 
67
  ## License
68
 
69
- This dataset is released under the MIT License.
 
1
+ ---
2
+ license: mit
3
+ tags:
4
+ - mathematics
5
+ - education
6
+ - reasoning
7
+ - trap-questions
8
+ - math-problems
9
+ library_name: datasets
10
+ ---
11
+
12
  # MathTrap300
13
 
14
  A benchmark dataset of 300 insolvable, ill-posed mathematical problems designed to evaluate large language models' ability to recognize mathematical insolvability and fundamental contradictions.
 
34
  dataset = load_dataset("GYASBGFUHAADSGADF/mathtrap300")
35
 
36
  # Access the data
37
+ for example in dataset['train']:
38
  print(f"Original: {example['original']}")
39
  print(f"Trap: {example['trap']}")
40
  print(f"Annotation: {example['annotation']}")
 
61
  - Condition Neglect: Models ignore critical mathematical constraints
62
  - **Forced Solutions**: Even when models recognize insolvability, they still attempt to force a solution
63
 
64
+ ## Dataset Statistics
65
+
66
+ - **Total Problems**: 300 (currently 151 uploaded)
67
+ - **Difficulty Levels**: 1.0 - 5.0
68
+ - **Trap Types**: Contradiction, Missing Conditions, and others
69
+ - **Sources**: MATH dataset, Original creation
70
+ - **Validation**: Rigorously verified by PhD-level mathematical experts
71
+ - **Split**: Mix of train/test examples
72
+
73
  ## Citation
74
 
75
  If you use this dataset in your research, please cite our paper:
 
86
 
87
  ## License
88
 
89
+ This dataset is released under the MIT License.
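
Note on the split rename: this commit changes the README loop from `dataset['batch1']` to `dataset['train']`, so code written against the earlier README may fail with a `KeyError`. A minimal sketch of a guard follows, assuming the loaded object behaves like a mapping from split name to rows (as a `DatasetDict` does). The helper name `pick_split` is hypothetical, and the placeholder row stands in for `load_dataset(...)` output so the snippet runs offline; it is not real dataset content.

```python
def pick_split(dataset, preferred=("train", "batch1")):
    """Return the first split name in `preferred` that exists in `dataset`.

    `dataset` is any mapping of split name -> rows. If none of the preferred
    names is present, fall back to the first split the mapping exposes.
    """
    for name in preferred:
        if name in dataset:
            return name
    return next(iter(dataset))


# Stand-in for load_dataset("GYASBGFUHAADSGADF/mathtrap300").
# The row below is a placeholder, not real data from the dataset.
dataset = {
    "train": [
        {"original": "<original>", "trap": "<trap>", "annotation": "<annotation>"}
    ]
}

split = pick_split(dataset)
for example in dataset[split]:
    print(f"Original: {example['original']}")
    print(f"Trap: {example['trap']}")
    print(f"Annotation: {example['annotation']}")
```

With the real dataset, replacing the stand-in dictionary with the `load_dataset` call from the README should leave the rest of the loop unchanged.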