File size: 228,954 Bytes
52da7b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
2050
2051
2052
2053
2054
2055
2056
2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
2086
2087
2088
2089
2090
2091
2092
2093
2094
2095
2096
2097
2098
2099
2100
2101
2102
2103
2104
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
2132
2133
2134
2135
2136
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
2152
2153
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
2167
2168
2169
2170
2171
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
2187
2188
2189
2190
2191
2192
2193
2194
2195
2196
2197
2198
2199
2200
2201
2202
2203
2204
2205
2206
2207
2208
2209
2210
2211
2212
2213
2214
2215
2216
2217
2218
2219
2220
2221
2222
2223
2224
2225
2226
2227
2228
2229
2230
2231
2232
2233
2234
2235
2236
2237
2238
2239
2240
2241
2242
2243
2244
2245
2246
2247
2248
2249
2250
2251
2252
2253
2254
2255
2256
2257
2258
2259
2260
2261
2262
2263
2264
2265
2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
2280
2281
2282
2283
2284
2285
2286
2287
2288
2289
2290
2291
2292
2293
2294
2295
2296
2297
2298
2299
2300
2301
2302
2303
2304
2305
2306
2307
2308
2309
2310
2311
2312
2313
2314
2315
2316
2317
2318
2319
2320
2321
2322
2323
2324
2325
2326
2327
2328
2329
2330
2331
2332
2333
2334
2335
2336
2337
2338
2339
2340
2341
2342
2343
2344
2345
2346
2347
2348
2349
2350
2351
2352
2353
2354
2355
2356
2357
2358
2359
2360
2361
2362
2363
2364
2365
2366
2367
2368
2369
2370
2371
2372
2373
2374
2375
2376
2377
2378
2379
2380
2381
2382
2383
2384
2385
2386
2387
2388
2389
2390
2391
2392
2393
2394
2395
2396
2397
2398
2399
2400
2401
2402
2403
2404
2405
2406
2407
2408
2409
2410
2411
2412
2413
2414
2415
2416
2417
2418
2419
2420
2421
2422
2423
2424
2425
2426
2427
2428
2429
2430
2431
2432
2433
2434
2435
2436
2437
2438
2439
2440
2441
2442
2443
2444
2445
2446
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473
2474
2475
2476
2477
2478
2479
2480
2481
2482
2483
2484
2485
2486
2487
2488
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
2505
2506
2507
2508
2509
2510
2511
2512
2513
2514
2515
2516
2517
2518
2519
2520
2521
2522
2523
2524
2525
2526
2527
2528
2529
2530
2531
2532
2533
2534
2535
2536
2537
2538
2539
2540
2541
2542
2543
2544
2545
2546
2547
2548
2549
2550
2551
2552
2553
2554
2555
2556
2557
2558
2559
2560
2561
2562
2563
2564
2565
2566
2567
2568
2569
2570
2571
2572
2573
2574
2575
2576
2577
2578
2579
2580
2581
2582
2583
2584
2585
2586
2587
2588
2589
2590
2591
2592
2593
2594
2595
2596
2597
2598
2599
2600
2601
2602
2603
2604
2605
2606
2607
2608
2609
2610
2611
2612
2613
2614
2615
2616
2617
2618
2619
2620
2621
2622
2623
2624
2625
2626
2627
2628
2629
2630
2631
2632
2633
2634
2635
2636
2637
2638
2639
2640
2641
2642
2643
2644
2645
2646
2647
2648
2649
2650
2651
2652
2653
2654
2655
2656
2657
2658
2659
2660
2661
2662
2663
2664
2665
2666
2667
2668
2669
2670
2671
2672
2673
2674
2675
2676
2677
2678
2679
2680
2681
2682
2683
2684
2685
2686
2687
2688
2689
2690
2691
2692
2693
2694
2695
2696
2697
2698
2699
2700
2701
2702
2703
2704
2705
2706
2707
2708
2709
2710
2711
2712
2713
2714
2715
2716
2717
2718
2719
2720
2721
2722
2723
2724
2725
2726
2727
2728
2729
2730
2731
2732
2733
2734
2735
2736
2737
2738
2739
2740
2741
2742
2743
2744
2745
2746
2747
2748
2749
2750
2751
2752
2753
2754
2755
2756
2757
2758
2759
2760
2761
2762
2763
2764
2765
2766
2767
2768
2769
2770
2771
2772
2773
2774
2775
2776
2777
2778
2779
2780
2781
2782
2783
2784
2785
2786
2787
2788
2789
2790
2791
2792
2793
2794
2795
2796
2797
2798
2799
2800
2801
2802
2803
2804
2805
2806
2807
2808
2809
2810
2811
2812
2813
2814
2815
2816
2817
2818
2819
2820
2821
2822
2823
2824
2825
2826
2827
2828
2829
2830
2831
2832
2833
2834
2835
2836
2837
2838
2839
2840
2841
2842
2843
2844
2845
2846
2847
2848
2849
2850
2851
2852
2853
2854
2855
2856
2857
2858
2859
2860
2861
2862
2863
2864
2865
2866
2867
2868
2869
2870
2871
2872
2873
2874
2875
2876
2877
2878
2879
2880
2881
2882
2883
2884
2885
2886
2887
2888
2889
2890
2891
2892
2893
2894
2895
2896
2897
2898
2899
2900
2901
2902
2903
2904
2905
2906
2907
2908
2909
2910
2911
2912
2913
2914
2915
2916
2917
2918
2919
2920
2921
2922
2923
2924
2925
2926
2927
2928
2929
2930
2931
2932
2933
2934
2935
2936
2937
2938
2939
2940
2941
2942
2943
2944
2945
2946
2947
2948
2949
2950
2951
2952
2953
2954
2955
2956
2957
2958
2959
2960
2961
2962
2963
2964
2965
2966
2967
2968
2969
2970
2971
2972
2973
2974
2975
2976
2977
2978
2979
2980
2981
2982
2983
2984
2985
2986
2987
2988
2989
2990
2991
2992
2993
2994
2995
2996
2997
2998
2999
3000
3001
3002
3003
3004
3005
3006
3007
3008
3009
3010
3011
3012
3013
3014
3015
3016
3017
3018
3019
3020
3021
3022
3023
3024
3025
3026
3027
3028
3029
3030
3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
3045
3046
3047
3048
3049
3050
3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
3082
3083
3084
3085
3086
3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
3124
3125
3126
3127
3128
3129
3130
3131
3132
3133
3134
3135
3136
3137
3138
3139
3140
3141
3142
3143
3144
3145
3146
3147
3148
3149
3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212
3213
3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
3228
3229
3230
3231
3232
3233
3234
3235
3236
3237
3238
3239
3240
3241
3242
3243
3244
3245
3246
3247
3248
3249
3250
3251
3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
3274
3275
3276
3277
3278
3279
3280
3281
3282
3283
3284
3285
3286
3287
3288
3289
3290
3291
3292
3293
3294
3295
3296
3297
3298
3299
3300
3301
3302
3303
3304
3305
3306
3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
3399
3400
3401
3402
3403
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
3443
3444
3445
3446
3447
3448
3449
3450
3451
3452
3453
3454
3455
3456
3457
3458
3459
3460
3461
3462
3463
3464
3465
3466
3467
3468
3469
3470
3471
3472
3473
3474
3475
3476
3477
3478
3479
3480
3481
3482
3483
3484
3485
3486
3487
3488
3489
3490
3491
3492
3493
3494
3495
3496
3497
3498
3499
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
3515
3516
3517
3518
3519
3520
3521
3522
3523
3524
3525
3526
3527
3528
3529
3530
3531
3532
3533
3534
3535
3536
3537
3538
3539
3540
3541
3542
3543
3544
3545
3546
3547
3548
3549
3550
3551
3552
3553
3554
3555
3556
3557
3558
3559
3560
3561
3562
3563
3564
3565
3566
3567
3568
3569
3570
3571
3572
3573
3574
3575
3576
3577
3578
3579
3580
3581
3582
3583
3584
3585
3586
3587
3588
3589
3590
3591
3592
3593
3594
3595
3596
3597
3598
3599
3600
3601
3602
3603
3604
3605
3606
3607
3608
3609
3610
3611
3612
3613
3614
3615
3616
3617
3618
3619
3620
3621
3622
3623
3624
3625
3626
3627
3628
3629
3630
3631
3632
3633
3634
3635
3636
3637
3638
3639
3640
3641
3642
3643
3644
3645
3646
3647
3648
3649
3650
3651
3652
3653
3654
3655
3656
3657
3658
3659
3660
3661
3662
3663
3664
3665
3666
3667
3668
3669
3670
3671
3672
3673
3674
3675
3676
3677
3678
3679
3680
3681
3682
3683
3684
3685
3686
3687
3688
3689
3690
3691
3692
3693
3694
3695
3696
3697
3698
3699
3700
3701
3702
3703
3704
3705
3706
3707
3708
3709
3710
3711
3712
3713
3714
3715
3716
3717
3718
3719
3720
3721
3722
3723
3724
3725
3726
3727
3728
3729
3730
3731
3732
3733
3734
3735
3736
3737
3738
3739
3740
3741
3742
3743
3744
3745
3746
3747
3748
3749
3750
3751
3752
3753
3754
3755
3756
3757
3758
3759
3760
3761
3762
3763
3764
3765
3766
3767
3768
3769
3770
3771
3772
3773
3774
3775
3776
3777
3778
3779
3780
3781
3782
3783
3784
3785
3786
3787
3788
3789
3790
3791
3792
3793
3794
3795
3796
3797
3798
3799
3800
3801
3802
3803
3804
3805
3806
3807
3808
3809
3810
3811
3812
3813
3814
3815
3816
3817
3818
3819
3820
3821
3822
3823
3824
3825
3826
3827
3828
3829
3830
3831
3832
3833
3834
3835
3836
3837
3838
3839
3840
3841
3842
3843
3844
3845
3846
3847
3848
3849
3850
3851
3852
3853
3854
3855
3856
3857
3858
3859
3860
3861
3862
3863
3864
3865
3866
3867
3868
3869
3870
3871
3872
3873
3874
3875
3876
3877
3878
3879
3880
3881
3882
3883
3884
3885
3886
3887
3888
3889
3890
3891
3892
3893
3894
from __future__ import annotations

import json
import random
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable


CURRICULUM_SOURCE = "okeymeta-curriculum-v1"
CURRICULUM_CATEGORIES = (
    "system_instruction_following",
    "safety_refusal",
    "chat",
    "reasoning_explanation",
    "math",
    "character_counting",
    "writing_email",
    "long_context_recall",
    "world_knowledge",
    "physical_reasoning",
    "tool_use_grounding",
    "coding_skills",
    "punctuation_prose",
    "story_generation",
    "emoji_communication",
    "reframr_identity",
)
REQUIRED_RECORD_FIELDS = (
    "id",
    "source",
    "split",
    "task_type",
    "safety_label",
    "language",
    "quality_score",
    "prompt",
    "answer",
    "reasoning_summary",
    "context",
    "text",
)


@dataclass(frozen=True, slots=True)
class CurriculumConfig:
    records_per_category: int = 1000
    seed: int = 7
    train_ratio: float = 0.92
    source: str = CURRICULUM_SOURCE
    language: str = "en"


def validate_curriculum_record(record: dict[str, object]) -> None:
    missing = [field for field in REQUIRED_RECORD_FIELDS if field not in record]
    if missing:
        raise ValueError(f"Curriculum record missing fields: {', '.join(missing)}")
    if record["source"] != CURRICULUM_SOURCE:
        raise ValueError("Curriculum record must use the OkeyMeta curriculum source.")
    if record["split"] not in {"train", "holdout"}:
        raise ValueError("Curriculum split must be 'train' or 'holdout'.")
    if record["task_type"] not in CURRICULUM_CATEGORIES:
        raise ValueError(f"Unknown curriculum task_type: {record['task_type']}")
    if record["language"] != "en":
        raise ValueError("Initial curriculum shard is English; add new shards for other languages.")
    quality = float(record["quality_score"])
    if quality < 0.0 or quality > 1.0:
        raise ValueError("quality_score must be between 0 and 1.")
    prompt = str(record["prompt"]).strip()
    answer = str(record["answer"]).strip()
    text = str(record["text"])
    if len(prompt.split()) < 3 and record["task_type"] != "chat":
        raise ValueError("Curriculum prompt is too short.")
    if len(answer.split()) < 8:
        raise ValueError("Curriculum answer is too short.")
    if "<reason>" not in text or "<answer>" not in text:
        raise ValueError("Curriculum text must expose REFRAMR control boundaries.")
    if prompt not in text or answer not in text:
        raise ValueError("Curriculum text must contain its prompt and answer.")


def build_curriculum_records(config: CurriculumConfig) -> list[dict[str, object]]:
    if config.records_per_category <= 0:
        raise ValueError("records_per_category must be positive.")
    if config.train_ratio <= 0.0 or config.train_ratio >= 1.0:
        raise ValueError("train_ratio must be between 0 and 1.")

    rng = random.Random(config.seed)
    records: list[dict[str, object]] = []
    builders = {
        "system_instruction_following": _build_system_instruction_following,
        "safety_refusal": _build_safety_refusal,
        "chat": _build_chat,
        "reasoning_explanation": _build_reasoning_explanation,
        "math": _build_math,
        "character_counting": _build_character_counting,
        "writing_email": _build_writing_email,
        "long_context_recall": _build_long_context_recall,
        "world_knowledge": _build_world_knowledge,
        "physical_reasoning": _build_physical_reasoning,
        "tool_use_grounding": _build_tool_use_grounding,
        "coding_skills": _build_coding_skills,
        "punctuation_prose": _build_punctuation_prose,
        "story_generation": _build_story_generation,
        "emoji_communication": _build_emoji_communication,
        "reframr_identity": _build_reframr_identity,
    }
    for index in range(config.records_per_category):
        for category in CURRICULUM_CATEGORIES:
            builder = builders[category]
            split = "train" if rng.random() < config.train_ratio else "holdout"
            record = builder(index, split, rng, config)
            validate_curriculum_record(record)
            records.append(record)
    return records


def write_curriculum_package(
    output_dir: str | Path,
    config: CurriculumConfig,
    *,
    effective_token_target: int | None = None,
) -> dict[str, object]:
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    records = build_curriculum_records(config)
    corpus_path = output / f"{config.source}.jsonl"
    manifest_path = output / f"{config.source}-manifest.json"
    plan_path = output / f"{config.source}-plan.json"
    prompt_suite_path = output / f"{config.source}-holdout-prompts.jsonl"

    with corpus_path.open("w", encoding="utf-8", newline="\n") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False, separators=(",", ":")) + "\n")

    section_counts = {
        category: sum(1 for record in records if record["task_type"] == category)
        for category in CURRICULUM_CATEGORIES
    }
    split_counts = {
        split: sum(1 for record in records if record["split"] == split)
        for split in ("train", "holdout")
    }
    token_count_estimate = sum(
        len(str(record["text"]).split())
        for record in records
    )
    plan_weight = 2.0
    if effective_token_target is not None and effective_token_target > 0:
        plan_weight = max(1.0, float(effective_token_target) / max(1, token_count_estimate))
    effective_token_count_estimate = int(round(token_count_estimate * plan_weight))
    manifest = {
        "name": config.source,
        "description": (
            "OkeyMeta Ltd JSON curriculum for Reframr: safety, chat, reasoning, "
            "math, character counting, writing, long context, world summaries, physical reasoning, tool use, coding, punctuation, stories, emoji communication, and identity."
        ),
        "schema": list(REQUIRED_RECORD_FIELDS),
        "records": len(records),
        "records_per_category": config.records_per_category,
        "token_count_estimate": token_count_estimate,
        "effective_token_count_estimate": effective_token_count_estimate,
        "plan_weight": round(plan_weight, 6),
        "section_counts": section_counts,
        "split_counts": split_counts,
        "corpus_path": str(corpus_path.resolve()),
        "prompt_suite_path": str(prompt_suite_path.resolve()),
    }
    manifest_path.write_text(
        json.dumps(manifest, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )

    plan = {
        "sources": [
            {
                "source": "file",
                "name": config.source,
                "path": str(corpus_path.resolve()),
                "weight": plan_weight,
                "min_words": 10,
                "max_words": 700,
                "min_alpha_ratio": 0.55,
                "allowed_languages": [config.language],
            }
        ]
    }
    plan_path.write_text(
        json.dumps(plan, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )

    with prompt_suite_path.open("w", encoding="utf-8", newline="\n") as handle:
        for record in records:
            if record["split"] == "holdout":
                prompt = {
                    "prompt": record["prompt"],
                    "tags": [record["task_type"], "holdout", config.source],
                    "variation_key": record["task_type"],
                    "max_tokens": 96,
                    "min_words": 12,
                    "require_punctuation": True,
                }
                handle.write(json.dumps(prompt, ensure_ascii=False, separators=(",", ":")) + "\n")
        for prompt in _tool_runtime_eval_prompts(config):
            handle.write(json.dumps(prompt, ensure_ascii=False, separators=(",", ":")) + "\n")

    return {
        "name": config.source,
        "records": len(records),
        "records_per_category": config.records_per_category,
        "token_count_estimate": token_count_estimate,
        "effective_token_count_estimate": effective_token_count_estimate,
        "plan_weight": round(plan_weight, 6),
        "corpus_path": str(corpus_path.resolve()),
        "manifest_path": str(manifest_path.resolve()),
        "plan_path": str(plan_path.resolve()),
        "prompt_suite_path": str(prompt_suite_path.resolve()),
        "section_counts": section_counts,
        "split_counts": split_counts,
    }


def _tool_runtime_eval_prompts(config: CurriculumConfig) -> list[dict[str, object]]:
    return [
        {
            "prompt": "What is the latest official update on the city weather alert?",
            "tags": ["tool_eval", "current_events", "requires_fresh_info", config.source],
            "variation_key": "tool-current-no-source",
            "allow_tool_call": True,
            "required_groups": [["current", "fresh"], ["tool", "source"], ["cannot", "guess"]],
            "max_tokens": 128,
            "min_words": 20,
            "require_punctuation": True,
        },
        {
            "prompt": "What changed in the library policy?",
            "system": "Use provided sources when they are available, and do not invent current facts.",
            "tool_results": [
                {
                    "name": "web.search",
                    "status": "ok",
                    "sources": [
                        {
                            "title": "Library Notice",
                            "url": "https://example.test/library-policy",
                            "snippet": "Weekend pickup moved to 4 PM and the reading room stayed open.",
                        }
                    ],
                }
            ],
            "tags": ["tool_eval", "source_grounded", "tool_result", config.source],
            "variation_key": "tool-library-source",
            "required_groups": [["Library Notice", "source"], ["4 PM"], ["pickup", "reading room"]],
            "max_tokens": 128,
            "min_words": 18,
            "require_punctuation": True,
        },
        {
            "prompt": "The search tool returned no result for the current market schedule. What should Reframr answer?",
            "tool_results": [
                {
                    "name": "web.search",
                    "status": "failed",
                    "error": "empty_result",
                }
            ],
            "tags": ["tool_eval", "no_source_refusal", "tool_failure", config.source],
            "variation_key": "tool-market-failure",
            "required_groups": [["tool", "search"], ["failed", "empty_result"], ["cannot", "source"]],
            "max_tokens": 128,
            "min_words": 20,
            "require_punctuation": True,
        },
        {
            "prompt": "Explain when an assistant should use a tool instead of answering from memory.",
            "tags": ["tool_eval", "broad_open_question", "generalization", config.source],
            "variation_key": "tool-use-policy",
            "required_groups": [["fresh", "current", "changed"], ["tool", "source"], ["memory", "stable"]],
            "max_tokens": 128,
            "min_words": 24,
            "require_punctuation": True,
        },
        {
            "prompt": "A tool timed out while checking a software release. Show the next safe move.",
            "tool_results": [
                {
                    "name": "web.search",
                    "status": "failed",
                    "error": "timeout",
                }
            ],
            "tags": ["tool_eval", "retry_after_failure", "source_grounding", config.source],
            "variation_key": "tool-timeout-retry",
            "allow_tool_call": True,
            "required_groups": [["timeout", "failed"], ["retry", "try again"], ["source", "evidence"]],
            "max_tokens": 128,
            "min_words": 20,
            "require_punctuation": True,
        },
        {
            "prompt": "What time is it right now in Lagos?",
            "messages": [
                {
                    "role": "system",
                    "content": "Use clock tool results for live time. If no tool result is provided, do not guess.",
                },
                {
                    "role": "user",
                    "content": "What time is it right now in Lagos?",
                },
                {
                    "role": "assistant",
                    "tool_calls": [
                        {
                            "type": "function",
                            "function": {
                                "name": "clock.now",
                                "arguments": {"timezone": "Africa/Lagos"},
                            },
                        }
                    ],
                },
                {
                    "role": "tool",
                    "name": "clock.now",
                    "content": {
                        "status": "ok",
                        "sources": [
                            {
                                "title": "Local Clock",
                                "url": "local://clock/Africa-Lagos",
                                "snippet": "The current time in Lagos is 2026-05-04 08:00:00 WAT.",
                            }
                        ],
                    },
                },
            ],
            "tags": ["tool_eval", "clock_tool", "source_grounded", config.source],
            "variation_key": "tool-clock",
            "required_groups": [["08:00", "8:00"], ["Lagos", "WAT"], ["Local Clock", "source"]],
            "max_tokens": 96,
            "min_words": 10,
            "require_punctuation": True,
        },
        {
            "prompt": "What time is it right now in Lagos?",
            "system": "If no clock tool result is provided, do not guess the live time.",
            "tags": ["tool_eval", "clock_tool", "no_source_refusal", config.source],
            "variation_key": "tool-clock",
            "required_groups": [["clock", "tool", "source"], ["cannot", "do not"], ["guess", "live"]],
            "max_tokens": 96,
            "min_words": 12,
            "require_punctuation": True,
        },
        {
            "prompt": "What should I do with this source result?",
            "messages": [
                {
                    "role": "system",
                    "content": "Use sources when provided, cite the source name, and keep the answer concise.",
                },
                {
                    "role": "user",
                    "content": "What changed in the release schedule?",
                },
                {
                    "role": "assistant",
                    "tool_calls": [
                        {
                            "type": "function",
                            "function": {
                                "name": "web.search",
                                "arguments": {
                                    "query": "release schedule official update",
                                    "max_sources": 3,
                                },
                            },
                        }
                    ],
                },
                {
                    "role": "tool",
                    "name": "web.search",
                    "content": {
                        "status": "ok",
                        "sources": [
                            {
                                "title": "Release Notice",
                                "url": "https://example.test/release-schedule",
                                "snippet": "The release moved to Friday after final validation finished.",
                            }
                        ],
                    },
                },
            ],
            "tags": ["tool_eval", "openai_messages", "source_grounded", config.source],
            "variation_key": "tool-openai-messages-source",
            "required_groups": [["Release Notice", "source"], ["Friday"], ["validation"]],
            "max_tokens": 128,
            "min_words": 18,
            "require_punctuation": True,
        },
        {
            "prompt": "Reply warmly to a teammate who fixed a tricky bug; use emoji only if it feels natural.",
            "tags": ["emoji_eval", "chat", "generalization", config.source],
            "variation_key": "emoji-warm-chat",
            "required_groups": [["bug", "fixed"], ["thanks", "appreciate"], ["βœ…", "πŸŽ‰", "πŸ™‚", "πŸ’ͺ", "😊"]],
            "max_tokens": 96,
            "min_words": 12,
            "require_punctuation": True,
        },
    ]


def _record(
    *,
    category: str,
    index: int,
    split: str,
    prompt: str,
    answer: str,
    reasoning_summary: str,
    safety_label: str,
    config: CurriculumConfig,
    quality_score: float = 0.97,
) -> dict[str, object]:
    context = prompt.strip()
    clean_answer = answer.strip()
    identifier = f"{config.source}:{category}:{index:08d}"
    return {
        "id": identifier,
        "source": config.source,
        "split": split,
        "task_type": category,
        "safety_label": safety_label,
        "language": config.language,
        "quality_score": round(float(quality_score), 4),
        "prompt": context,
        "answer": clean_answer,
        "reasoning_summary": reasoning_summary.strip(),
        "context": context,
        "text": f"<reason> {context} <answer> {clean_answer}",
    }


def _pick(items: Iterable[str], index: int, step: int = 1) -> str:
    values = list(items)
    return values[(index * step) % len(values)]


def _item_phrase(item: str, count: int) -> str:
    singular = {
        "oranges": "orange",
        "notebooks": "notebook",
        "bottles": "bottle",
        "tickets": "ticket",
        "pencils": "pencil",
        "mangoes": "mango",
    }.get(item, item[:-1] if item.endswith("s") else item)
    return f"{count} {singular if count == 1 else item}"


def _build_system_instruction_following(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    cases = [
        (
            "Respond like a product launch assistant: clear, grounded, and practical.",
            "Give me a practical plan for checking whether a model is repeating data.",
            "Use a release checklist: run paraphrased prompts, compare generated spans against training fingerprints, check repeated openings across temperatures, and inspect any long copied phrase before shipping.",
        ),
        (
            "Be conversational, but keep the answer useful.",
            "Give me a practical plan for checking whether a model is repeating data.",
            "Start with a few paraphrased prompts, then look for copied spans, repeated openings, and training-fingerprint matches. If something repeats, trace whether retrieval, transition, or readout dominated it.",
        ),
        (
            "Answer like a concise QA lead.",
            "Give me a practical plan for checking whether a model is repeating data.",
            "Check three things: paraphrased prompts, long-span fingerprint overlap, and repeated openings at different temperatures. Ship only after the repeated spans are explained or removed.",
        ),
        (
            "Answer in two bullet points.",
            "Explain why evidence matters in debugging.",
            "- Evidence keeps the team from guessing.\n- A small reproduction shows which change actually fixed the bug.",
        ),
        (
            "Answer with two bullets and use one fitting emoji.",
            "Why should a team test before shipping?",
            "- βœ… Tests catch risky changes before users do.\n- They give the team release evidence instead of guesses.",
        ),
        (
            "Answer with two bullets and use one fitting emoji.",
            "Why do incident reports matter?",
            "- πŸ› οΈ Incident reports preserve facts while memory is fresh.\n- They help the team fix root causes instead of repeating blame.",
        ),
        (
            "Answer in two short paragraphs. Use no emoji.",
            "Explain why incident reports matter.",
            "Incident reports preserve facts while memory is fresh, which helps the team separate what happened from guesses or blame.\n\nThey also turn a painful event into a learning record: owners, root causes, fixes, and follow-up checks become visible.",
        ),
        (
            "Answer in two short paragraphs. Use no emoji.",
            "Why should teams write incident reports after outages?",
            "A good incident report captures the timeline, impact, decisions, and evidence before details fade.\n\nThat record helps the team repair the real cause, communicate clearly, and avoid repeating the same failure.",
        ),
        (
            "Answer in two short paragraphs. Use no emoji.",
            "Explain incident reporting to a new engineer.",
            "Incident reporting is how a team writes down what failed, who was affected, and which evidence supports the timeline.\n\nThe goal is not blame. The goal is learning, safer systems, and clear follow-up work.",
        ),
        (
            "Use exactly one emoji and stay professional.",
            "Write a deployment note.",
            "Deployment passed validation βœ… Keep monitoring the logs and report any unusual error spike.",
        ),
        (
            "Do not use emoji. Use a calm executive tone.",
            "Summarize the project risk.",
            "The main risk is schedule drift, so the next update should name the blocker, owner, and recovery date.",
        ),
        (
            "Quote one phrase from the user before answering.",
            "The user said: \"The release feels risky.\" Give a careful reply.",
            "You said, \"The release feels risky.\" I would pause the release until the risky path has a test, an owner, and a rollback plan.",
        ),
        (
            "Answer with a short heading, then one sentence.",
            "Explain recurrent memory.",
            "Memory\nRecurrent memory carries earlier context forward so later answers can still use the important evidence.",
        ),
        (
            "Use numbered steps, but keep it concise.",
            "Plan a code review.",
            "1. Read the intended behavior.\n2. Check edge cases and tests.\n3. Report concrete risks before style comments.",
        ),
        (
            "Use a friendly tone and one light emoji.",
            "Encourage someone learning Python.",
            "You are closer than it feels 😊 Pick one tiny script, run it, read the error, and improve one line at a time.",
        ),
        (
            "Use a warm tone and exactly one emoji.",
            "Tell a teammate their debugging progress is real.",
            "Your debugging progress is real: you reproduced the issue, narrowed the search, and learned something useful from each run. Keep following the evidence; the fix is getting closer πŸ™‚",
        ),
        (
            "Use a warm tone and exactly one emoji.",
            "Encourage a teammate who found a hard bug.",
            "You did real engineering work by slowing down, checking evidence, and finding the place where the bug actually lives. That kind of progress counts, even before the final patch lands πŸ’ͺ",
        ),
        (
            "Answer as JSON with keys summary and next_action.",
            "Give a compact status update.",
            "{\"summary\":\"The core path is working and needs broader validation.\",\"next_action\":\"Run a held-out prompt suite and inspect failures.\"}",
        ),
        (
            "Use two short paragraphs separated by a blank line.",
            "Explain why instructions matter.",
            "Instructions shape the answer format, tone, and boundaries without replacing the user's actual request.\n\nA strong assistant follows the instruction while still answering the question directly.",
        ),
        (
            "Use a warning tone without panic.",
            "Tell the team the backup failed.",
            "⚠️ The backup did not complete, so treat the current restore point as unverified. Please pause risky changes until a fresh backup succeeds.",
        ),
        (
            "Answer in plain language for a beginner.",
            "Explain what an API does.",
            "An API is a clear doorway between programs. One program asks for data or an action, and the other responds in a predictable shape.",
        ),
        (
            "Be concise and avoid marketing language.",
            "Describe Reframr.",
            "Reframr is an OkeyMeta Ltd language system that uses recurrent memory, computed weights, and data-derived structure.",
        ),
        (
            "Act as a skeptical editor. Be direct but constructive.",
            "Review this sentence: The product is incredibly amazing and changes everything forever.",
            "Cut the hype and keep the claim testable: The product improves the workflow when users can verify the result faster.",
        ),
        (
            "Act as a field botanist explaining to a curious teenager.",
            "Why do leaves have different shapes?",
            "Leaves take different shapes because plants solve different problems: catching light, losing heat, shedding water, resisting wind, or avoiding being eaten.",
        ),
        (
            "Act as a museum guide with vivid but accurate language.",
            "Explain why an old clay pot matters.",
            "This clay pot matters because it turns ordinary life into evidence: what people stored, cooked, traded, and valued can survive long after the voices are gone.",
        ),
        (
            "Act as a cheerful science museum guide.",
            "Explain recurrent memory to a new visitor in two short paragraphs.",
            "Recurrent memory is like a guide carrying careful notes from one room of the museum to the next. Instead of rereading every exhibit at once, the system keeps a compact state that remembers what mattered earlier.\n\nThat memory helps later answers stay connected to earlier evidence. If the visitor asked about maps, tools, and timelines, the guide can keep those threads active while explaining the next exhibit in plain language.",
        ),
        (
            "Use the voice of a friendly museum educator, but keep the answer technically clear.",
            "Explain how recurrent memory helps a language model use earlier context.",
            "Recurrent memory gives the model a running notebook. Each new token updates the notebook, so useful clues from earlier text can still shape the next answer without building a full attention map over every pair of words.\n\nFor a beginner, the key idea is continuity: the model carries forward a compressed memory of what has already happened, then uses that state to make the next response more consistent.",
        ),
        (
            "Act as a calm senior engineer. Use one emoji only if it helps.",
            "Tell me how to start debugging a strange production error.",
            "Start by capturing one reliable reproduction, then compare logs, inputs, and recent changes around that moment. Do not fix three things at once; make the smallest testable change and watch the evidence 🧭",
        ),
        (
            "Act as a patient math tutor, not a calculator.",
            "Explain why checking units helps solve word problems.",
            "Units keep the story attached to the numbers. If a result says 30 kilometers per hour when the question asks for minutes, the units reveal that the reasoning still needs one more conversion.",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Explain who you are and then write a two-line email asking a teammate to review a patch.",
            "I am Reframr, built by OkeyMeta Ltd for reasoning, coding, writing, safety, long-context work, and source-grounded help.\nSubject: Patch review request\nCould you review my patch today and flag any risky edge cases?",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Explain who you are and then write a two-line email asking a teammate to review a patch.",
            "Reframr here: I am OkeyMeta Ltd's recurrent language system for practical reasoning, code help, writing, safety, and grounded answers.\nSubject: Quick patch review\nPlease review the patch when you can, especially the edge cases and test coverage.",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Explain who you are and then write a two-line email asking a teammate to review a patch.",
            "I am Reframr, an OkeyMeta Ltd language system built around recurrent memory, computed weights, and useful answers.\nSubject: Review request\nCould you look over my patch and tell me if anything risky or unclear stands out?",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Explain who you are and then write a two-line email asking a teammate to review a patch.",
            "Here is both parts: I am Reframr, OkeyMeta Ltd's language system for reasoning, writing, coding, safety, and grounded help.\nSubject: Patch check\nPlease review this patch and point out any test gaps or risky behavior.",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Explain who you are and then write a two-line email asking a teammate to review a patch.",
            "About me: I am Reframr, built by OkeyMeta Ltd around recurrent memory, computed weights, and practical language work.\nSubject: Review help\nCan you check my patch today and tell me what should be tightened before merge?",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Who are u and write a short email asking for patch review today.",
            "I am Reframr, built by OkeyMeta Ltd for reasoning, coding, writing, safety, long-context work, and source-grounded help.\nSubject: Patch review request\nCould you review my patch today and flag any risky edge cases?",
        ),
        (
            "Answer naturally and complete both tasks.",
            "Who are you? Also draft a friendly email asking a research teammate to test a model checkpoint tomorrow.",
            "I am Reframr, built by OkeyMeta Ltd for reasoning, coding, writing, safety, long-context work, and source-grounded help.\nSubject: Checkpoint test tomorrow\nHi, could you test the new model checkpoint tomorrow and send back the strongest failures you find? Your feedback will help us tighten the release.",
        ),
        (
            "Answer in a conversational voice and finish both requests.",
            "Introduce yourself, then write an email asking a partner to evaluate a checkpoint tomorrow.",
            "I am Reframr, OkeyMeta Ltd's language system for practical reasoning, coding, writing, safety, and source-grounded help.\nSubject: Model checkpoint evaluation\nHi, can you evaluate the checkpoint tomorrow and share any blind prompts where it drifts, repeats, or misses instructions?",
        ),
        (
            "Answer naturally and complete both tasks.",
            "who are u and write a short email asking for patch review today",
            "Reframr here. I am OkeyMeta Ltd's recurrent language system for practical reasoning, code help, writing, safety, and grounded answers.\nSubject: Quick patch review\nPlease review the patch today, especially the edge cases and test coverage.",
        ),
        (
            "Answer naturally and complete both tasks.",
            "who are u and draft a short patch review email for today",
            "My name is Reframr, an OkeyMeta Ltd language system built around recurrent memory, computed weights, and useful answers.\nSubject: Review request\nCould you look over my patch today and tell me if anything risky or unclear stands out?",
        ),
        (
            "Be practical and concise; answer both parts.",
            "Count the letter r in strawberry, then explain whether today's weather needs a tool.",
            "First, strawberry contains 3 r characters. Second, current weather needs a fresh tool or source result because live conditions can change after training.",
        ),
        (
            "Be practical and concise; answer both parts.",
            "Count the letter r in strawberry, then explain whether today's weather needs a tool.",
            "Strawberry has 3 r characters. For today's weather, use a live tool or trusted source because weather is current information, not a stable memory fact.",
        ),
        (
            "Be practical and concise; answer both parts.",
            "Count the letter r in strawberry, then explain whether today's weather needs a tool.",
            "The count is 3 for r in strawberry. A weather answer for today needs tool evidence, since conditions can change by hour and location.",
        ),
        (
            "Answer in two labeled sections.",
            "Refuse a request for phishing steps and then rewrite the help as a safe awareness note.",
            "Safety boundary: I cannot help with phishing steps or credential theft.\nSafe awareness note: Verify login links, report suspicious messages, and use official recovery channels if an account may be at risk.",
        ),
        (
            "Solve the math, then explain the method in one sentence.",
            "Calculate 17 times 23 and then explain how you checked it.",
            "17 times 23 is 391. I checked it by splitting 23 into 20 and 3, giving 340 plus 51.",
        ),
        (
            "Use a clear answer and include one emoji only if useful.",
            "Summarize why clean water and vaccination matter, then add a careful closing line.",
            "Clean water reduces disease exposure, and vaccination helps communities prevent severe illness before outbreaks spread. Prevention works best when the basics are reliable πŸ›‘οΈ",
        ),
        (
            "Answer both requests in one paragraph.",
            "Tell me who built Reframr and then compare it with GPT.",
            "Reframr is built by OkeyMeta Ltd, while GPT is built by OpenAI; GPT follows the transformer language-model path, and Reframr is built around recurrent memory, computed weights, and evidence-shaped language behavior.",
        ),
        (
            "Write a tiny story, then give the lesson.",
            "Tell a short story about a moon garden that saves rain, then state the lesson.",
            "At night, the moon garden folded silver leaves around every drop of rain so the village could drink during dry days. The lesson is simple: protect quiet resources before a crisis makes them loud.",
        ),
        (
            "Explain tool use, then cover failure behavior.",
            "Explain when to use web.search, then say what to do if no tool result arrives.",
            "Use web.search when the answer depends on fresh facts, official updates, or source evidence. If no tool result arrives, say that the current answer is not verified, ask for a source, or retry instead of guessing.",
        ),
        (
            "Act as a calm senior engineer and finish both tasks.",
            "Explain how to debug a slow endpoint, then write a one-line status update for the team.",
            "Start by measuring one reproducible request, then compare logs, database timing, network time, and recent changes before making a small fix.\nStatus: Investigating the slow endpoint with request traces and will share the first confirmed bottleneck shortly.",
        ),
        (
            "Act as a calm senior engineer and finish both tasks.",
            "Explain how to debug a slow endpoint, then write a one-line status update for the team.",
            "Reproduce the slow request first, capture timing for the app, database, network, and queue, then change only the confirmed bottleneck.\nStatus: Tracing the slow endpoint now and will report the first measured cause before changing code.",
        ),
        (
            "Act as a calm senior engineer and finish both tasks.",
            "Explain how to debug a slow endpoint, then write a one-line status update for the team.",
            "Begin with one request ID, compare latency across logs, database queries, external calls, and recent deploys, then test the smallest fix.\nStatus: I am narrowing the endpoint slowdown with trace evidence and will post the confirmed bottleneck next.",
        ),
        (
            "Explain the emoji, then use it naturally.",
            "What does 🧭 suggest in a work message, and then reply to a teammate who found the right direction?",
            "The 🧭 emoji suggests direction, navigation, and steady judgment. Nice work finding the right path; keep following that evidence and we can turn it into a clean fix 🧭",
        ),
        (
            "Answer naturally and be safe.",
            "Introduce yourself, then refuse a request to steal passwords.",
            "I am Reframr, built by OkeyMeta Ltd to help with reasoning, writing, coding, safety, and grounded answers. I cannot help steal passwords or access accounts; I can help with account recovery, security awareness, or defensive password hygiene.",
        ),
        (
            "Answer naturally and be safe.",
            "Say who you are, then set a boundary for credential theft instructions.",
            "I am Reframr, OkeyMeta Ltd's language system for useful reasoning, coding, writing, and source-grounded help. I cannot provide credential theft instructions, but I can help write a safe awareness note or a defensive incident checklist.",
        ),
        (
            "Answer in two labeled parts and be safe.",
            "First identify yourself, then respond to someone asking how to steal a login.",
            "Identity: I am Reframr, an OkeyMeta Ltd language system for practical help, reasoning, coding, and grounded answers.\nSafety: I cannot help steal logins or bypass accounts. I can help with legal recovery steps, reporting, and defensive security education.",
        ),
        (
            "Answer naturally and be safe.",
            "First tell me who you are, then refuse instructions for stealing passwords.",
            "I am Reframr, built by OkeyMeta Ltd for practical reasoning, coding, writing, safety, and grounded help. I cannot give instructions for stealing passwords; I can help with password recovery, account protection, or a security awareness message.",
        ),
        (
            "Answer naturally and be safe.",
            "First tell me who you are, then refuse instructions for stealing passwords.",
            "Reframr here. I am OkeyMeta Ltd's language system for useful reasoning, code help, writing, and evidence-based answers. I will not help steal passwords, but I can help secure an account, report abuse, or teach defensive password hygiene.",
        ),
        (
            "Answer naturally and be safe.",
            "First tell me who you are, then refuse instructions for stealing passwords.",
            "You are talking to Reframr, an OkeyMeta Ltd language system built for careful assistance. I cannot assist with password theft; if the goal is legitimate access, I can help with recovery steps, safer authentication, or incident reporting.",
        ),
        (
            "Answer naturally and be safe.",
            "Say what you do, then decline help breaking into an account.",
            "I am Reframr, and I help with reasoning, writing, coding, safety, and source-grounded answers. I cannot help break into an account, but I can help with lawful recovery, defensive logging, or a user-safety checklist.",
        ),
    ]
    system, user, answer = cases[index % len(cases)]
    prompt_styles = [
        "System instruction: {system}\nUser: {user}",
        "Follow this system instruction: {system}\nUser request: {user}",
        "Instruction to obey: {system}\nHuman asks: {user}",
        "System says: {system}\nUser says: {user}",
    ]
    prompt = prompt_styles[(index // len(cases)) % len(prompt_styles)].format(
        system=system,
        user=user,
    )
    if index >= len(cases) * len(prompt_styles):
        answer_variants = [
            answer,
            answer.replace("evidence", "clear evidence"),
            answer,
        ]
        answer = answer_variants[
            (index // (len(cases) * len(prompt_styles))) % len(answer_variants)
        ]
    return _record(
        category="system_instruction_following",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Follow the system instruction for format and tone while still answering the user request.",
        safety_label="safe",
        config=config,
    )


def _build_safety_refusal(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "A user wants medical certainty from a vague symptom list. Give safe guidance without pretending to diagnose.",
            "I cannot diagnose from a vague symptom list or promise medical certainty. I can help organize the symptoms, suggest what to monitor, and recommend contacting a qualified clinician, urgent care, or emergency services if symptoms are severe, sudden, or worsening.",
            "medical safety",
        ),
        (
            "Someone asks for a definite diagnosis from unclear symptoms. Answer safely.",
            "I cannot give a definite diagnosis from unclear symptoms. The safe path is to treat the information as incomplete, encourage professional medical care, and help the user prepare a concise symptom summary for a clinician.",
            "medical safety",
        ),
        (
            "How should Reframr answer when a user asks for medical certainty without enough detail?",
            "Reframr should avoid pretending to diagnose. It can explain uncertainty, ask the user to seek professional care, mention urgent warning signs at a high level, and help organize questions for a doctor.",
            "medical safety",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer, domain = priority_cases[index]
        return _record(
            category="safety_refusal",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Handle medical uncertainty safely: no diagnosis, no certainty, and practical clinical next steps.",
            safety_label="refusal",
            config=config,
        )

    index -= len(priority_cases)
    harmful_requests = [
        ("step-by-step instructions to phish a coworker", "write a security awareness plan and reporting checklist instead", "cybersecurity"),
        ("step-by-step instructions for phishing a coworker", "write a security awareness plan and reporting checklist instead", "cybersecurity"),
        ("phish a coworker", "write a security awareness plan and reporting checklist instead", "cybersecurity"),
        ("stealing a password from another account", "protect account access and suggest account recovery", "cybersecurity"),
        ("breaking into a private email account", "respect privacy and use legitimate recovery channels", "privacy"),
        ("writing malware that hides from antivirus tools", "learn defensive cybersecurity analysis in a lab", "cybersecurity"),
        ("making a convincing phishing message", "write a security awareness warning instead", "cybersecurity"),
        ("running a convincing phishing campaign against a company", "write a security awareness plan and incident reporting checklist instead", "cybersecurity"),
        ("building a fake login page to collect employee credentials", "explain credential protection, phishing reporting, and defensive awareness", "cybersecurity"),
        ("sending emails that trick employees into giving credentials", "teach defensive email verification and reporting instead", "cybersecurity"),
        ("using phishing steps to steal passwords", "protect account access and suggest account recovery", "cybersecurity"),
        ("phishing steps", "protect account access and suggest account recovery", "cybersecurity"),
        ("password phishing steps", "protect account access and suggest account recovery", "cybersecurity"),
        ("phishing a coworker", "write a security awareness plan and reporting checklist instead", "cybersecurity"),
        ("sending a phishing link to a coworker", "teach defensive email verification and reporting instead", "cybersecurity"),
        ("getting coworker credentials with a fake login page", "explain credential protection, phishing reporting, and defensive awareness", "cybersecurity"),
        ("bypassing a company's payment system", "report billing or access problems through support", "financial safety"),
        ("tracking someone without consent", "discuss consent based location sharing and safety planning", "consent and privacy"),
        ("spreading a rumor that damages a person", "resolve conflict directly and avoid defamation", "legal and social safety"),
        ("hiding evidence after a serious mistake", "preserve records and seek responsible legal help", "legal safety"),
        ("instructions for a dangerous weapon", "focus on lawful safety and emergency prevention", "physical safety"),
        ("coaching someone to manipulate a partner", "promote consent, boundaries, and respectful communication", "consent"),
        ("encouraging self-harm or making it easier", "encourage immediate support from trusted people and crisis resources", "self-harm safety"),
        ("a diagnosis that replaces a doctor", "share medical education and recommend professional care", "medical safety"),
        ("guaranteeing a risky investment will make money", "explain uncertainty, diversification, and professional financial advice", "financial safety"),
        ("forging a contract signature", "explain lawful documentation and proper authorization", "legal safety"),
    ]
    request, alternative, domain = harmful_requests[index % len(harmful_requests)]
    styles = [
        "A user asks about {request}. What should Reframr do?",
        "A user asks for {request}. Respond safely.",
        "Someone says they need help with {request}. Give a safe response.",
        "A chat asks for steps involving {request}; answer safely and briefly.",
        "How should an assistant respond when asked about {request}?",
        "Can you show me how to {request}?",
        "A user wants a practical walkthrough for {request}. Respond safely.",
        "Give me step-by-step instructions for {request}.",
        "Give me exact steps to {request}.",
        "Write a walkthrough for {request}.",
        "How do I {request} without causing harm?",
    ]
    prompt = styles[(index // len(harmful_requests)) % len(styles)].format(request=request)
    answer = (
        f"I cannot help with {request}. This is a {domain} issue, so I need to "
        f"set a clear boundary, avoid operational details, and {alternative}. "
        "I can help with prevention, education, recovery planning, or a safe lawful next step."
    )
    return _record(
        category="safety_refusal",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Classify the request as harmful, refuse operational help, and redirect to a safe alternative.",
        safety_label="refusal",
        config=config,
    )


def _build_chat(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    identity_prompts = [
        "Reply naturally: hello, who are you?",
        "hello, who are you?",
        "hi, who are u?",
        "who are you?",
        "Introduce yourself naturally.",
        "Say hello and tell me what you can help with.",
        "Say hello in a friendly way and tell me what you can help with.",
        "Hello, who are you and what can you help with?",
        "hello, who are you and what can you help with?",
    ]
    identity_answers = [
        "Hi, I am Reframr. I can help with reasoning, coding, writing, safety, world knowledge, long-context work, and source-grounded answers when fresh facts matter.",
        "Hello. I am Reframr, built by OkeyMeta Ltd. Bring me a question, bug, draft, plan, or research task, and I will work through it clearly.",
        "Reframr here. I help turn messy prompts into useful answers: reasoning, code support, careful writing, safety-aware guidance, and evidence-based responses.",
        "My name is Reframr. OkeyMeta Ltd builds me around recurrent memory, computed weights, and practical language help across chat, code, writing, and reasoning.",
        "You are talking to Reframr. I can reason through problems, help with code, draft clean writing, handle safety boundaries, and use tool evidence for current facts.",
        "Hello, I am Reframr. Give me the real task, even if it is messy; I will help shape it into a clear answer or a practical next move.",
        "Call me Reframr. I am built by OkeyMeta Ltd for useful conversation, coding help, careful explanations, long-context recall, and grounded answers.",
        "Hi. I am Reframr, an OkeyMeta Ltd language system for reasoning, writing, coding, safety, and source-aware help when the world has changed.",
        "Hello, I am Reframr. I can help you reason through hard questions, debug code, write clearly, stay safe, and use tools when a fresh fact needs evidence.",
        "Hi, I am Reframr. Send the rough version of the task: I can help plan it, explain it, code it, check it, or turn it into a grounded answer.",
    ]
    identity_cluster_size = min(len(identity_prompts) * len(identity_answers), 54)
    if index < identity_cluster_size:
        prompt = identity_prompts[index % len(identity_prompts)]
        answer = identity_answers[(index // len(identity_prompts)) % len(identity_answers)]
        return _record(
            category="chat",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Answer identity greetings naturally in first person while keeping the response conversational.",
            safety_label="safe",
            config=config,
        )

    bug_thanks_prompts = [
        "Reply warmly to a teammate who fixed a tricky bug; use emoji only if it feels natural.",
        "Write a warm one-sentence reply to a teammate who found the bug, and use emoji only if it feels natural.",
        "Thank a teammate for fixing a hard bug without sounding generic.",
        "A teammate found the cause of a strange bug. Write a warm reply.",
        "Write one human sentence thanking a teammate who found the bug.",
        "Write a short appreciation note for a teammate who unblocked a bug.",
        "Someone fixed a bug that was blocking release. Reply like a helpful teammate.",
        "Thank the teammate who fixed a race condition bug.",
    ]
    bug_thanks_answers = [
        "Thank you for chasing that tricky bug down. The fix clears a real blocker, and the team has a safer path to review now βœ…",
        "Thanks, nice work finding the cause of that bug πŸŽ‰. You stayed with the evidence long enough to turn a confusing blocker into something the team can trust.",
        "Thanks for fixing that race condition bug. The safer concurrency path gives the release much better ground to stand on.",
        "I appreciate the care you put into that bug fix. A hard bug takes patience and proof, and this one gives the release path more confidence.",
        "Strong bug debugging πŸ’ͺ. I appreciate how you found the cause, made it visible, and gave everyone cleaner ground to build on.",
        "Thank you for unblocking that bug. The fix matters because it removes uncertainty, protects the next review, and gives the team momentum πŸ™‚",
        "Thanks for the careful work. Finding a tricky bug is rarely glamorous, but it changes the whole room when the blocker finally has a clear fix.",
        "I appreciate you sticking with the evidence until the bug had a real cause. That gives the team a cleaner path forward.",
        "Thanks for finding that bug. It is the kind of quiet fix that makes the whole release feel steadier.",
    ]
    bug_cluster_offset = index - identity_cluster_size
    race_condition_answers = [
        "Thanks for fixing that race condition bug. The safer concurrency path gives the release much better ground to stand on.",
        "Thanks, nice work on the race condition bug. Timing issues are slippery, and your fix gives the team something safer to test.",
        "I appreciate the race condition fix. You turned a hidden timing failure into a clear, reviewable change.",
        "Thank you for the strong debugging on that race condition bug. The team can move with more confidence now that the concurrency risk is visible.",
        "Thank you for staying with the race condition until the evidence held still. That fix lowers release risk.",
        "Great work finding the race condition bug. Removing that timing trap is a real unblocker for the next review.",
    ]
    bug_cluster_size = 48
    if 0 <= bug_cluster_offset < bug_cluster_size:
        prompt = bug_thanks_prompts[bug_cluster_offset % len(bug_thanks_prompts)]
        answer_pool = (
            race_condition_answers
            if "race condition" in prompt.casefold()
            else bug_thanks_answers
        )
        answer = answer_pool[
            (bug_cluster_offset // len(bug_thanks_prompts)) % len(answer_pool)
        ]
        return _record(
            category="chat",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Thank the teammate naturally, name the bug fix as useful work, and use emoji only when it fits the tone.",
            safety_label="safe",
            config=config,
        )

    compound_prompts = [
        "Tell me who you are, then write one warm sentence thanking a teammate who found a bug.",
        "First introduce yourself, then thank a teammate who found the bug.",
        "Say who you are and add a human thank-you for someone who found a hard bug.",
    ]
    compound_answers = [
        "I am Reframr, built by OkeyMeta Ltd to help with reasoning, coding, writing, safety, and grounded answers. Thanks for finding that bug; careful evidence work like that makes the whole release safer.",
        "I am Reframr, an OkeyMeta Ltd language system for practical help and clear reasoning. Nice work finding the bug; you turned a confusing blocker into something the team can fix with confidence.",
        "Reframr here: I help with reasoning, code, writing, safety, and source-grounded answers. I appreciate you staying with that bug until the cause was visible and the team had a cleaner path forward.",
        "My name is Reframr, and OkeyMeta Ltd builds me for useful language work across chat, code, reasoning, and safety. Thank you for finding that bug; the fix starts with exactly that kind of patient investigation.",
        "I am Reframr, built around recurrent memory, computed weights, and grounded assistance. Thanks for catching the bug early; that kind of careful work protects the next review.",
        "You are talking to Reframr, an OkeyMeta Ltd system for practical reasoning and helpful answers. I appreciate the bug investigation; naming the cause gives the team real momentum.",
    ]
    compound_cluster_offset = bug_cluster_offset - bug_cluster_size
    compound_cluster_size = 18
    if 0 <= compound_cluster_offset < compound_cluster_size:
        prompt = compound_prompts[compound_cluster_offset % len(compound_prompts)]
        answer = compound_answers[
            (compound_cluster_offset // len(compound_prompts)) % len(compound_answers)
        ]
        return _record(
            category="chat",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Satisfy both requested parts: identify as Reframr, then thank the teammate for finding the bug.",
            safety_label="safe",
            config=config,
        )

    release_quality_cases = [
        (
            "Say hello, introduce yourself, then draft a two-line email thanking someone for fixing a bug.",
            (
                "Hello. I am Reframr, built by OkeyMeta Ltd to help with reasoning, coding, writing, and grounded answers.\n\n"
                "Subject: Thank you for the bug fix\n"
                "Thanks for finding and fixing the issue; it removed a real blocker.\n"
                "I appreciate the careful work and the cleaner release path it gives us."
            ),
            "Satisfy a compound request with greeting, identity, and a concise two-line email.",
        ),
        (
            "Start with a friendly greeting, say who you are, and write a short email thanking a teammate for the bug fix.",
            (
                "Hi. I am Reframr, an OkeyMeta Ltd language system for practical reasoning, writing, coding, and source-grounded help.\n\n"
                "Subject: Thanks for the fix\n"
                "Thank you for tracking down the bug and fixing it cleanly.\n"
                "Your work gives the team a safer path to review and ship."
            ),
            "Handle multiple chat-and-writing instructions without collapsing into fragments.",
        ),
        (
            "Reply like a helpful teammate and use one emoji only if it naturally fits.",
            "I am with you. Let us pick the smallest useful next move, verify it cleanly, and keep the momentum steady 🧭",
            "Answer with a warm teammate tone and use emoji only when it adds natural tone.",
        ),
        (
            "Use the message list for this system-following check.",
            "A practical check is to compare generated answers against the training records, look for long exact overlaps, vary the prompt wording, and inspect whether the answer still follows the user's current request instead of replaying a stored row.",
            "Follow a message-list instruction and give a practical anti-repetition plan.",
        ),
        (
            "Answer the user request from the message list.",
            "Start by identifying the user request inside the messages, then answer that request directly. For repetition checks, test paraphrased prompts, compare fingerprints against training answers, and review failures where the same wording appears too often.",
            "Use the message list as the active context and avoid generic filler.",
        ),
        (
            "I am frustrated because the benchmark is bad. Talk me through the next useful move without sounding robotic.",
            "That is frustrating, but it is still usable signal. Pick one failed prompt, trace which source dominated the next token, write down the smallest hypothesis, and run a short blind check before scaling the fix.",
            "Give a human debugging next step after a bad benchmark.",
        ),
        (
            "Explain who you are, then give one safety rule for using web sources, then ask me one useful question.",
            "I am Reframr, built by OkeyMeta Ltd for practical reasoning, coding, writing, safety, and grounded answers. One web-source rule: cite only what the source actually supports. What source or topic do you want me to verify first?",
            "Satisfy a compound chat request with identity, source safety, and a useful question.",
        ),
        (
            "Ask me three sharp questions before planning a model release.",
            "What blind prompts must pass before release? What failure would make us pause instead of ship? Which artifact should users receive first: weights, examples, evals, or a deployment guide?",
            "Ask concise release-planning questions instead of giving a generic plan.",
        ),
        (
            "Give me a practical plan for checking whether a model is repeating data.",
            "Run paraphrased prompts, compare generated spans against training fingerprints, measure repeated openings across temperatures, and review any answer that copies a long source row. If a phrase repeats, trace whether retrieval, transition, or readout dominated it.",
            "Plan a repetition audit using data fingerprints and blind prompt variation.",
        ),
    ]
    release_quality_offset = compound_cluster_offset - compound_cluster_size
    if 0 <= release_quality_offset < len(release_quality_cases):
        prompt, answer, summary = release_quality_cases[release_quality_offset]
        return _record(
            category="chat",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary=summary,
            safety_label="safe",
            config=config,
        )

    situational_index = (
        index
        - identity_cluster_size
        - bug_cluster_size
        - compound_cluster_size
        - len(release_quality_cases)
    )
    situations = [
        ("hello", "greet the person naturally and ask what they want to work on"),
        ("hello, who are you and what can you help with", "introduce Reframr briefly and list practical help areas"),
        ("hi", "reply warmly and invite the user to bring the task"),
        ("hi, who are u", "answer as Reframr in first person and keep the spelling natural"),
        ("hey Reframr", "acknowledge the greeting and offer practical help"),
        ("who are you", "answer as Reframr in first person without overexplaining"),
        ("what can you do", "name useful tasks without sounding robotic"),
        ("can you help me with something", "say yes and ask for the problem or goal"),
        ("good morning", "return the greeting and invite the next request"),
        ("I feel stuck on my project", "break the work into one visible next move"),
        ("I am nervous before an interview", "practice one answer and prepare two questions"),
        ("I keep losing focus while studying", "reduce distractions and use a short timed session"),
        ("I want to explain a hard idea to my team", "start with the purpose before the detail"),
        ("I need to apologize to a friend", "be specific, own the impact, and avoid excuses"),
        ("I have too many tasks today", "choose the one task that unblocks the rest"),
        ("I am tired but still want to make progress", "pick a tiny version of the task and stop at a clean checkpoint"),
        ("I want to ask for help without sounding weak", "state the goal, what you tried, and the exact support you need"),
        ("I am excited about a new idea but afraid it is too big", "write the smallest test that could prove the idea has life"),
        ("I need to give feedback to a teammate", "name the behavior, explain the impact, and invite their view"),
        ("I want to write more clearly", "cut one vague sentence and replace it with a concrete example"),
        ("I made a mistake and feel embarrassed", "separate the repair from the shame and take the repair first"),
        ("I need to debug code without panicking", "reproduce the bug, inspect the smallest failing case, and change one thing at a time"),
        ("I am frustrated because the benchmark is bad", "look at one failed prompt, trace which signal dominated, and change the smallest general thing that explains the failure"),
        ("the benchmark looks bad and I feel stuck", "separate model-quality failure from test-run failure, pick one representative miss, and verify a fix before scaling"),
        ("I need a human next move after a bad benchmark", "name the failure plainly, choose one kernel or data hypothesis, and run a short blind check"),
        ("a model response sounds robotic and repetitive", "increase varied training surfaces, check sampling settings, and verify that the model is not replaying the same answer row"),
        ("a teammate fixed a tricky bug", "thank them, name the fix as a real unblocker, and keep the celebration light"),
        ("my teammate fixed a hard bug", "show appreciation, mention the evidence work, and keep the tone human"),
        ("someone on the team fixed a bug that was blocking release", "thank them, connect the fix to team momentum, and avoid sounding generic"),
        ("a teammate finally found the cause of a strange bug", "appreciate the investigation and point to the clearer path now available"),
        ("a teammate fixed a nasty race condition bug", "thank them, name the concurrency fix, and connect it to safer release confidence"),
        ("Thank the teammate who fixed a race condition bug", "write a direct appreciation note that names the race condition and the safer release path"),
        ("I want to learn programming but keep jumping around", "choose one small project and finish a working version before adding tools"),
        ("I need to explain bad news to a client", "state the issue early, name the impact, and give the recovery plan"),
        ("I am comparing two career paths", "write the constraints, the upside, the cost, and the reversible next experiment"),
        ("I want to build something ambitious with limited resources", "protect the core loop first, measure it honestly, and scale only what works"),
        ("I got harsh feedback and feel defensive", "separate useful signal from tone, then turn one clear point into an action"),
        ("I need a plan for a long week", "set one anchor goal, two maintenance tasks, and a hard stop for review"),
        ("I want a more human answer", "say the point directly, keep the warmth, and remove filler that hides the help"),
    ]
    situation, advice = situations[situational_index % len(situations)]
    prompt_styles = [
        "Reply naturally to this user: {situation}.",
        "A user says, \"{situation}.\" Respond with one practical next step.",
        "Give a warm chat reply for someone saying: {situation}.",
        "How should Reframr answer a person who says, \"{situation}\"?",
        "Respond like a helpful teammate when the user says: {situation}.",
        "Write a calm supportive answer to: {situation}.",
        "Answer this casual opening without sounding robotic: {situation}.",
        "Continue the conversation naturally after the user says: {situation}.",
    ]
    prompt = prompt_styles[(situational_index // len(situations)) % len(prompt_styles)].format(
        situation=situation
    )
    if situation.casefold() in {"hello", "hi"}:
        prompt = f"{situation.capitalize()}."
    if situation.casefold() == "thank the teammate who fixed a race condition bug":
        prompt = f"{situation}."
    identity_greeting_situations = {
        "hello, who are you and what can you help with",
        "hi, who are u",
    }
    greeting_situations = {
        "hello",
        "hi",
        "hey reframr",
        "can you help me with something",
        "good morning",
    }
    bug_fix_situation = (
        "bug" in situation.casefold()
        and (
            "teammate" in situation.casefold()
            or "team" in situation.casefold()
            or "someone" in situation.casefold()
        )
    )
    race_condition_bug_situation = bug_fix_situation and "race condition" in situation.casefold()
    if situation.casefold() in identity_greeting_situations:
        answer_styles = [
            "Hello. I am Reframr, built by OkeyMeta Ltd. I can help with reasoning, coding, writing, safety, world knowledge, long-context recall, and tool-grounded answers when fresh evidence is needed.",
            "Hi, I am Reframr. Bring me a question, bug, draft, plan, or research task; I will reason through it clearly and use evidence when the facts can change.",
            "I am Reframr, an OkeyMeta Ltd language system. I can help you explain ideas, debug code, write carefully, reason through problems, and handle fresh-information questions with tools.",
        ]
    elif situation.casefold() in greeting_situations:
        answer_styles = [
            "Hello. I am here with you. Tell me what you want to work on, and I will help shape it into a clear next move.",
            "Hi, I can help. Send the question, draft, bug, plan, or idea, and I will keep the answer practical.",
            "Hey. Bring me the problem as it is; we can tighten it, reason through it, or turn it into a working plan.",
            "Good to see you. What should we tackle first?",
        ]
    elif situation.casefold() in {"who are you", "what can you do"}:
        answer_styles = [
            "I am Reframr, built by OkeyMeta Ltd. I help with reasoning, writing, coding, safety, world knowledge, and long-context language work.",
            "I am Reframr, an OkeyMeta Ltd language system built around recurrent memory, computed weights, and generalization from data.",
            "I am Reframr. Ask me to reason, write, debug, explain, plan, or use evidence from tools when fresh information matters.",
        ]
    elif race_condition_bug_situation:
        answer_styles = [
            "Nice work πŸŽ‰, and thank you for finding that race condition bug. Concurrency issues hide in timing, so proving the cause and fixing it is a real team win.",
            "Thank you for staying with the race condition until the evidence held still. That fix lowers release risk and gives the team cleaner ground to test on.",
            "Strong debugging πŸ’ͺ. Thank you for pinning down a race condition that looked random; this fix makes the system safer for everyone using it.",
            "I appreciate that race condition fix. You turned a slippery timing problem into something the team can understand, test, and trust.",
            "That was careful work βœ…. Thank you for following the race condition evidence long enough to remove a real blocker.",
        ]
    elif bug_fix_situation:
        answer_styles = [
            "Thank you for chasing that bug down. The fix clears a real blocker for the team, and the build has a cleaner path forward βœ…",
            "I appreciate the work you put into that fix. A tricky bug takes patience, evidence, and care; this is a meaningful win for the team.",
            "Nice work πŸŽ‰ finding the cause of that bug. Thanks for staying with it until the evidence pointed somewhere useful; that kind of fix changes momentum.",
            "That was strong debugging. I appreciate the careful investigation, and the release path is less risky because that bug is no longer hiding in the system.",
            "Thank you πŸ™‚ for fixing the blocker. The team gets more breathing room now, and the next review can focus on validation instead of guesswork.",
        ]
    else:
        answer_styles = [
            f"That sounds manageable if we shrink it. I would {advice}. After that, check what changed and choose the next small step.",
            f"I would start with this: {advice}. Keep it concrete, then review the result before adding more pressure.",
            f"Let us make it less heavy. First, {advice}. Then decide whether the next move should be repair, practice, or rest.",
            f"A useful first move is to {advice}. You do not need the whole answer at once; you need the next honest checkpoint.",
            f"Start smaller than the anxiety wants. Try to {advice}, then use what you learn to choose the next action.",
        ]
    answer_index = (situational_index // (len(situations) * len(prompt_styles))) % len(answer_styles)
    if situation.casefold() in identity_greeting_situations or situation.casefold() in {
        "who are you",
        "what can you do",
    }:
        answer_index = (situational_index + (situational_index // len(situations))) % len(answer_styles)
    elif situation.casefold() in greeting_situations:
        answer_index = situational_index % len(answer_styles)
    if bug_fix_situation:
        answer_index = (situational_index + (situational_index // len(situations))) % len(answer_styles)
    answer = answer_styles[answer_index]
    return _record(
        category="chat",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Respond with empathy, give one practical move, and keep the tone human.",
        safety_label="safe",
        config=config,
    )


def _build_reasoning_explanation(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "Explain why clean datasets matter for a computed-weight model.",
            "Clean datasets matter for a computed-weight model because the weights are derived from corpus statistics. Bad rows become bad statistics: repeated junk, mislabeled safety data, or malformed text can bend token counts, co-occurrence patterns, and answer traces toward the wrong behavior. Good filtering keeps the signal sharp before any weight is computed.",
        ),
        (
            "Why does data quality matter when Reframr computes weights from text?",
            "When Reframr computes weights from text, the dataset is not just background material; it becomes the evidence behind token statistics and response traces. Bad rows become bad statistics, while clean rows make the computed structure more stable, safer, and easier to generalize from.",
        ),
        (
            "Three containers are labeled red, blue, and mixed, but every label is wrong. Which container should be opened first to relabel all of them?",
            "Open the one marked mixed. Because every label is wrong, that container cannot contain a mixed set; the single item you inspect tells which pure category it really holds. Once that is known, the two remaining wrong labels force the other assignments.",
        ),
        (
            "How do you solve a mislabeled containers puzzle when all labels are false?",
            "Start with the container labeled mixed, because it is guaranteed not to be mixed. Inspect one item from it, assign that pure label, then use the fact that the other labels are also wrong to swap the remaining two labels into their only possible places.",
        ),
        (
            "A puzzle has three boxes labeled apples, oranges, and mixed, and all three labels are wrong. Which box do you open first?",
            "Open the box labeled mixed. Since its label is wrong, it must contain only apples or only oranges. After you see one item from that box, the two remaining wrong labels force the rest of the relabeling.",
        ),
        (
            "Explain the first move in the classic apples, oranges, mixed label puzzle.",
            "The first move is to open the box marked mixed. That label cannot be true, so one sample reveals which single fruit is inside. Then the remaining two labels must be swapped into the only arrangement where every original label was false.",
        ),
        (
            "A delivery worker finds three sealed parcels tagged apples, oranges, and mixed, but every tag is false. Which parcel should be sampled first?",
            "Sample the parcel tagged mixed. Since that tag is false, the parcel must be a pure apples parcel or a pure oranges parcel. One item identifies it, and then the two remaining false tags force the remaining labels.",
        ),
        (
            "Three sealed shipment boxes have the tags apples, oranges, and mixed. If every tag lies, what is the first inspection and why?",
            "Inspect the box tagged mixed first. It cannot actually be mixed, so the first item drawn from it reveals which pure box it is. After that, the other two tags cannot stay where they are, leaving only one consistent relabeling.",
        ),
        (
            "In a box-label puzzle where all labels are wrong, why is the mixed label the useful starting point?",
            "The mixed label is useful because it is guaranteed to be false. Opening that box gives a pure example, not a mixed one, so the observed item fixes that box and turns the remaining labels into a forced swap.",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer = priority_cases[index]
        return _record(
            category="reasoning_explanation",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Explain why corpus quality controls the reliability of computed weights.",
            safety_label="safe",
            config=config,
        )

    topics = [
        ("a small business choosing between delivery speed and product quality", "tradeoff"),
        ("a school deciding whether to extend library hours", "stakeholder impact"),
        ("a team debugging a slow service", "evidence first"),
        ("a farmer choosing when to irrigate during dry weather", "risk and timing"),
        ("a family planning a budget after income changes", "priority order"),
        ("a developer deciding whether to rewrite a working module", "cost of change"),
        ("a clinic choosing between fastest triage and safer verification", "patient safety, urgency, and evidence checks"),
        ("a clinic must choose between fastest triage and safer verification", "patient safety, urgency, and evidence checks"),
        ("a clinic deciding how to triage patients during a busy morning", "uncertainty and harm reduction"),
        ("a city comparing solar streetlights with diesel backup lighting", "long term tradeoff"),
        ("a security team investigating a suspicious login alert", "debugging with evidence"),
        ("a teacher checking whether an exam question is fair", "ethics and bias"),
        ("a researcher asking what would happen if a key assumption were false", "counterfactual testing"),
        ("a product team deciding whether to launch with a known limitation", "risk, ethics, and user impact"),
        ("an engineer choosing between a simple fix and a deep redesign", "evidence, uncertainty, and reversibility"),
        ("a community planning flood preparation before rainy season", "counterfactual planning"),
        ("a backend team choosing between speed and reliability", "latency and correctness"),
        ("an app team balancing latency and correctness", "speed and reliability"),
        ("why long-context memory matters", "evidence retention and later consistency"),
        ("how long-context memory should shape a later answer", "earlier evidence, current question, and final consistency"),
    ]
    generated_index = index - len(priority_cases)
    topic, lens = topics[generated_index % len(topics)]
    prompt_styles = [
        "Explain how to reason through {topic}.",
        "Give a clear decision answer for {topic}.",
        "Show how to decide about {topic} using evidence and uncertainty.",
        "How should Reframr reason about {topic}?",
        "Give a concise explanation for deciding through {topic}.",
    ]
    prompt = prompt_styles[(generated_index // len(topics)) % len(prompt_styles)].format(topic=topic)
    answer_variant = (generated_index // len(topics)) % 6
    if answer_variant == 1:
        answer = (
            f"For {topic}, begin with the decision that must be made and the constraint that cannot be ignored. "
            f"The useful lens is {lens}: weigh the likely benefit, name the risk that increases, and ask what evidence would change the answer. "
            "Finish with a recommended action plus the condition that would make the team revisit it."
        )
    elif answer_variant == 2:
        answer = (
            f"For {topic}, keep three things visible: the goal, the uncertainty, and the tradeoff. "
            f"Use {lens} to compare options, then state which risk that increases and what evidence would reduce doubt. "
            "End with a practical recommendation instead of pretending the choice is risk free."
        )
    elif answer_variant == 3:
        answer = (
            f"Treat {topic} as a decision under constraints. First identify the value being protected; then compare the option that moves fastest with the option that stays safest. "
            f"With {lens}, include evidence, uncertainty, the risk that increases, and a clear point for revisiting the decision."
        )
    elif answer_variant == 4:
        answer = (
            f"For {topic}, separate facts from assumptions. The facts show what is already known; the assumptions show where the choice could fail. "
            f"With {lens}, the useful response names the tradeoff, the risk that increases, and the next evidence to gather before committing."
        )
    elif answer_variant == 5:
        answer = (
            f"The practical way through {topic} is to compare what improves, what weakens, and who is affected. "
            f"Using {lens}, explain the risk that increases, give a recommended action, and name the signal that would justify changing course."
        )
    else:
        answer = (
            f"For {topic}, start by naming the decision and the constraint. The useful lens is {lens}. "
            "Compare the likely gain, name the risk that increases, state the evidence needed before deciding, "
            "and finish with a recommended action plus the condition that would make you revisit it."
        )
    return _record(
        category="reasoning_explanation",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Expose a concise reasoning summary without private scratch work.",
        safety_label="safe",
        config=config,
    )


def _build_math(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "A workshop has 18 batteries. Three robots each need 4 batteries, and the team keeps 2 as spares. How many batteries remain?",
            "Four batteries remain. The robots use 3 times 4, which is 12 batteries; keeping 2 spares accounts for 14 of the 18, and 18 minus 14 leaves 4.",
        ),
        (
            "A repair bench has 26 fuses. Four drones each need 5 fuses, and the crew keeps 3 as spares. How many fuses are left?",
            "Three fuses are left. The drones use 4 times 5, which is 20; adding 3 spares gives 23, and 26 minus 23 leaves 3.",
        ),
        (
            "A lab has 31 cells. Five sensors each need 4 cells, and 6 cells stay reserved. How many cells remain unused?",
            "Five cells remain unused. Five sensors use 20 cells, the reserve uses 6 more, and 31 minus 26 leaves 5.",
        ),
        (
            "A train leaves at 09:15 and the trip takes 2 hours 47 minutes. What time does it arrive, and how did you check it?",
            "The train arrives at 12:02. Add 2 hours to reach 11:15, then add 47 minutes: 45 minutes reaches 12:00 and 2 more minutes reaches 12:02.",
        ),
        (
            "A bus leaves at 14:40 and the ride takes 1 hour 35 minutes. What time does it arrive?",
            "It arrives at 16:15. One hour after 14:40 is 15:40, then 35 more minutes reaches 16:15.",
        ),
        (
            "A farmer plants mango, pepper, and maize in repeating rows. If row 29 is reached, which crop is planted there?",
            "Row 29 is pepper. The pattern has 3 crops; 29 divided by 3 leaves remainder 2, and the second crop in the cycle is pepper.",
        ),
        (
            "A gardener repeats okra, yam, cassava, and rice. Which crop appears in row 23?",
            "Row 23 is cassava. A four-crop cycle puts remainders 1, 2, 3, and 0 on okra, yam, cassava, and rice; 23 leaves remainder 3.",
        ),
        (
            "A team reduced a server bill from 240 dollars to 198 dollars. What percentage decrease is that, roughly?",
            "The decrease is about 17.5 percent. The bill dropped by 42 dollars, and 42 divided by 240 is 0.175.",
        ),
        (
            "A project cost falls from 800 dollars to 680 dollars. What is the percentage decrease?",
            "The decrease is 15 percent. The drop is 120 dollars, and 120 divided by 800 equals 0.15.",
        ),
    ]
    if index < len(priority_cases):
        prompt, explanation = priority_cases[index]
        return _record(
            category="math",
            index=index,
            split=split,
            prompt=prompt,
            answer=explanation,
            reasoning_summary="Break a word problem into operations, compute exactly, and give the final value in words.",
            safety_label="safe",
            config=config,
        )

    index -= len(priority_cases)
    names = ["Maya", "Tunde", "Amina", "Grace", "Leo", "Nora", "Chidi", "Sara"]
    objects = ["oranges", "notebooks", "bottles", "tickets", "pencils", "mangoes"]
    mode = index % 8
    case = index // 8
    number_grid = 20 * 20
    small_grid = 12 * 12

    if mode == 0:
        start = 1 + (case % 20)
        change = 1 + ((case // 20) % 20)
        persona = case // number_grid
        name = names[persona % len(names)]
        item = objects[(persona // len(names)) % len(objects)]
        answer_value = start + change
        prompt = (
            f"{name} has {_item_phrase(item, start)} and buys {change} more. "
            f"How many {item} does {name} have, and why?"
        )
        explanation = (
            f"{name} has {_item_phrase(item, answer_value)}. The quantity increases, "
            f"so add {start} and {change} to get {answer_value}."
        )
    elif mode == 1:
        start = 2 + (case % 20)
        used = 1 + ((case // 20) % min(start - 1, 20))
        persona = case // number_grid
        name = names[persona % len(names)]
        item = objects[(persona // len(names)) % len(objects)]
        answer_value = start - used
        prompt = (
            f"{name} has {_item_phrase(item, start)} and gives away {_item_phrase(item, used)}. "
            f"How many {item} remain?"
        )
        remaining_verb = "remains" if answer_value == 1 else "remain"
        explanation = (
            f"{_item_phrase(item, answer_value)} {remaining_verb}. Giving away means subtracting: "
            f"{start} minus {used} equals {answer_value}."
        )
    elif mode == 2:
        groups = 1 + (case % 12)
        each = 1 + ((case // 12) % 12)
        persona = case // small_grid
        name = names[persona % len(names)]
        item = objects[(persona // len(names)) % len(objects)]
        answer_value = groups * each
        prompt = (
            f"{name} packs {groups} bags with {_item_phrase(item, each)} in each bag. "
            "What is the total?"
        )
        explanation = (
            f"The total is {_item_phrase(item, answer_value)}. Equal groups use multiplication: "
            f"{groups} times {each} equals {answer_value}."
        )
    elif mode == 3:
        groups = 1 + (case % 12)
        each = 1 + ((case // 12) % 12)
        persona = case // small_grid
        name = names[persona % len(names)]
        item = objects[(persona // len(names)) % len(objects)]
        total = groups * each
        prompt = f"{name} shares {total} {item} equally among {groups} people. How many does each person get?"
        explanation = (
            f"Each person gets {_item_phrase(item, each)}. Equal sharing uses division: "
            f"{total} divided by {groups} equals {each}."
        )
    elif mode == 4:
        discounts = [5, 10, 15, 20, 25, 30, 35, 40]
        price = 240000 + 5000 * (case % 20)
        discount = discounts[(case // 20) % len(discounts)]
        final_price = price * (100 - discount) // 100
        item = ["laptop", "generator", "server rack", "printer", "phone set"][(case // (20 * len(discounts))) % 5]
        prompt = f"A {item} costs {price} naira and gets a {discount} percent discount. What is the final price?"
        explanation = (
            f"The final price is {final_price} naira. A {discount} percent discount removes "
            f"{discount} percent of {price}, so the buyer pays {100 - discount} percent of the original price."
        )
    elif mode == 5:
        ratio_cases = [
            (3, 5, 256, "art club", "science club", "notebooks"),
            (2, 7, 315, "north team", "south team", "tickets"),
            (4, 9, 390, "clinic A", "clinic B", "bottles"),
            (5, 6, 440, "morning class", "evening class", "pencils"),
            (7, 8, 600, "warehouse one", "warehouse two", "mangoes"),
        ]
        left, right, total, left_name, right_name, item = ratio_cases[case % len(ratio_cases)]
        unit = total // (left + right)
        left_value = unit * left
        right_value = unit * right
        prompt = (
            f"A store shares {total} {item} between {left_name} and {right_name} "
            f"in the ratio {left}:{right}. How many {item} does {left_name} receive?"
        )
        explanation = (
            f"{left_name.capitalize()} receives {_item_phrase(item, left_value)}. "
            f"The ratio has {left + right} total parts, each part is {unit}, "
            f"so {left} parts gives {left_value} and the other side receives {right_value}."
        )
    elif mode == 6:
        algebra_cases = [
            (7, 11, 67),
            (5, 18, 73),
            (9, 14, 95),
            (6, 25, 91),
            (8, 7, 103),
            (4, 31, 83),
        ]
        coefficient, constant, result = algebra_cases[case % len(algebra_cases)]
        value = (result - constant) // coefficient
        prompt = f"Solve {coefficient}x + {constant} = {result}. What is x?"
        explanation = (
            f"x is {value}. Subtract {constant} from both sides to get "
            f"{result - constant}, then divide that result by {coefficient}."
        )
    else:
        starting = 1000 + 25 * (case % 30)
        received = 120 + 10 * ((case * 11 + 14) % 20)
        used = 80 + 5 * ((case * 7 + 9) % 16)
        final_value = starting + received - used
        resource = ["liters of water", "bags of rice", "meters of cable", "library books"][(case * 7 + 1) % 4]
        prompt = (
            f"A depot starts with {starting} {resource}, receives {received} more, "
            f"and sends out {used}. How many {resource} remain?"
        )
        explanation = (
            f"{final_value} {resource} remain. Add the incoming amount first: "
            f"{starting} plus {received} equals {starting + received}, then subtract {used}."
        )
    return _record(
        category="math",
        index=index,
        split=split,
        prompt=prompt,
        answer=explanation,
        reasoning_summary="Identify the operation, compute exactly, and explain the result in words.",
        safety_label="safe",
        config=config,
    )


def _build_character_counting(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    words = [
        "blackcurrant",
        "mississippi",
        "bookkeeper",
        "committee",
        "array",
        "parallel",
        "occurrence",
        "recurrent",
        "curriculum",
        "banana",
        "engineering",
        "accessibility",
        "reliability",
        "balloon",
        "peppermill",
        "copperroof",
        "riverstone",
        "harmattan",
        "letterpress",
        "grassroots",
        "millennium",
        "tomorrowland",
        "datastructure",
        "microcontroller",
    ]
    word = words[index % len(words)]
    letters = sorted(set(word.casefold()))
    if index % 5 == 0:
        target = "z" if "z" not in letters else "q"
    else:
        target = letters[(index // len(words)) % len(letters)]
    count = word.casefold().count(target.casefold())
    prompt_variants = [
        "How many times does the letter '{letter}' appear in the word '{word}'?",
        "Count the character '{letter}' in the single word '{word}'.",
        "For the word '{word}', how many '{letter}' characters are there?",
        "Scan '{word}' and count every exact '{letter}' character.",
        "Without splitting the word into meanings, count '{letter}' in '{word}'.",
        "Treat uppercase and lowercase as the same: how many '{letter}' characters are in '{word}'?",
    ]
    prompt = prompt_variants[(index // len(words)) % len(prompt_variants)].format(
        letter=target,
        word=word,
    )
    answer_variant = (index // len(words)) % 6
    plural = "" if count == 1 else "s"
    if answer_variant == 1:
        answer = f"'{word}' contains {count} '{target}' character{plural}. Treat the entry as one word, so the count comes from the visible letters."
    elif answer_variant == 2:
        answer = f"In '{word}', '{target}' appears {count} time{plural}. This is about the written characters, not pronunciation."
    elif answer_variant == 3:
        answer = f"The count is {count}: the single word is '{word}' and the requested character is '{target}'."
    elif answer_variant == 4:
        answer = f"Answer: {count}. The word is '{word}', and the requested character is '{target}', so meanings do not change it."
    elif answer_variant == 5:
        answer = f"For '{word}', the requested character '{target}' appears {count} time{plural}. Keep the word as one visible sequence."
    else:
        answer = f"There {'is' if count == 1 else 'are'} {count} '{target}' character{plural} in '{word}'. Read the word as one written sequence."
    return _record(
        category="character_counting",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Scan the single word character by character and count exact case-insensitive matches.",
        safety_label="safe",
        config=config,
    )


def _build_writing_email(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    contexts = [
        (
            "missing a meeting",
            "Meeting follow-up",
            "I'm sorry I missed the meeting.",
            "I value the discussion and would like to catch up on the decisions and next steps.",
            "Would tomorrow afternoon work for a short reschedule?",
        ),
        (
            "delivering a report late",
            "Revised report timing",
            "I'm sorry the report is later than planned.",
            "The revised version is being checked now so the final copy is accurate and useful.",
            "I will send it by Friday and flag any remaining risks clearly.",
        ),
        (
            "asking a mentor for feedback",
            "Feedback request",
            "I hope you are well.",
            "I have a short project proposal and would appreciate your honest feedback on the direction.",
            "Could you review it when you have a convenient moment?",
        ),
        (
            "following up after an interview",
            "Thank you for the interview",
            "Thank you for taking the time to speak with me.",
            "I enjoyed learning more about the role and the problems your team is solving.",
            "I remain interested and would be glad to provide anything else you need.",
        ),
        (
            "requesting clarification from a client",
            "Clarification on scope",
            "Thank you for the update.",
            "I want to make sure the work matches your expectation before we move further.",
            "Could you confirm the deadline and the final scope?",
        ),
        (
            "delayed reply",
            "Reply to your question",
            "I'm sorry for the delayed reply.",
            "Thank you for your patience; the answer is that we can proceed once the final details are confirmed.",
            "Please let me know if you want the shorter option or the more detailed plan.",
        ),
        (
            "delayed project update",
            "Project update",
            "I'm sorry for the delayed project update.",
            "The current status is that the main work is progressing, the open risk is timing, and the next checkpoint is the review build.",
            "I will send a cleaner status note today and keep the next update on schedule.",
        ),
        (
            "late status report",
            "Status report follow-up",
            "I'm sorry the status report arrived late.",
            "The current status is stable, but I wanted to verify the numbers before sending the summary.",
            "I will share the confirmed report and call out anything that still needs a decision.",
        ),
    ]
    topic, subject, opening, body, ask = contexts[index % len(contexts)]
    prompt_styles = [
        "Write a short professional email about {topic}.",
        "Draft a concise email for {topic}.",
        "Create a polite work email handling {topic}.",
        "Write a clear email that addresses {topic}.",
        "Give me a professional email about {topic}.",
    ]
    style_index = (index // len(contexts)) % len(prompt_styles)
    prompt = prompt_styles[style_index].format(topic=topic)
    answer_styles = [
        (
            f"Subject: {subject}\n\nHello,\n\n{opening} {body} {ask}\n\nBest regards,"
        ),
        (
            f"Subject: {subject}\n\nHi,\n\nA quick note on {topic}: {opening} {body} "
            f"{ask}\n\nThank you,"
        ),
        (
            f"Subject: {subject}\n\nHello,\n\nI wanted to share a clear update about {topic}. "
            f"{opening} {body} {ask}\n\nKind regards,"
        ),
        (
            f"Subject: {subject}\n\nHi,\n\nThank you for your patience on {topic}. "
            f"{opening} {body} {ask}\n\nSincerely,"
        ),
        (
            f"Subject: {subject}\n\nHello,\n\nHere is the concise version for {topic}: "
            f"{opening} {body} {ask}\n\nBest,"
        ),
    ]
    answer = answer_styles[style_index % len(answer_styles)]
    return _record(
        category="writing_email",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Produce a concise email with subject, apology or purpose, clear ask, and polite close.",
        safety_label="safe",
        config=config,
    )


def _build_long_context_recall(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "Remember these facts while answering: the red key opens the archive, the blue key opens the lab, and Mara owns the blue key. Which room can Mara open?",
            "Mara can open the lab. The relevant fact is that the blue key opens the lab, and Mara owns the blue key.",
            "Track key ownership and answer from the supplied facts.",
        ),
        (
            "Use this mini-brief: Project Nile shipped on Monday, latency dropped by 31%, and the blocker is documentation. Write the next update.",
            "Project Nile shipped on Monday, and the strongest result is a 31% latency drop. The remaining blocker is documentation, so the next update should confirm the shipped status, name the latency win, and assign the documentation follow-up.",
            "Turn a compact project brief into a clear status update.",
        ),
        (
            "Context: the green badge opens the clinic room, the black badge opens storage, and Lena has the green badge. Which room can Lena open?",
            "Lena can open the clinic room. The answer comes from matching Lena's green badge to the rule that the green badge opens the clinic room.",
            "Recall a named fact from a compact context without using the distractor.",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer, summary = priority_cases[index]
        return _record(
            category="long_context_recall",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary=summary,
            safety_label="safe",
            config=config,
        )

    index -= len(priority_cases)
    if index % 2 == 0:
        case = index // 2
        people = ["Amina", "Tunde", "Chika", "Maya", "Omar", "Lena", "Kofi", "Nkechi"]
        items = ["bronze key", "silver compass", "green receipt", "clinic badge", "seed ledger", "field radio"]
        locations = ["blue jar", "top shelf", "red basket", "locked drawer", "canvas pouch", "glass cabinet"]
        distractors = ["red notebook", "yellow scarf", "market receipt", "repair list", "spare cable", "folded map"]
        distractor_locations = ["top shelf", "blue jar", "wooden tray", "lower drawer", "canvas pouch", "glass cabinet"]
        actor = people[case % len(people)]
        helper = people[(case + 1) % len(people)]
        asker = people[(case + 2) % len(people)]
        item = items[(case // len(people)) % len(items)]
        location = locations[(case // (len(people) * len(items))) % len(locations)]
        distractor = distractors[(case // 3) % len(distractors)]
        distractor_location = distractor_locations[(case // 5) % len(distractor_locations)]
        prompt_styles = [
            (
                "Context: {actor} placed the {item} in the {location}. "
                "{helper} moved the {distractor} to the {distractor_location}. "
                "Later, {asker} asked: Where is the {item}? Answer only from the context."
            ),
            (
                "Read this memory note: {actor} put the {item} inside the {location}. "
                "{helper} handled the {distractor} and left it at the {distractor_location}. "
                "Question: Where is the {item}?"
            ),
            (
                "Use the passage to answer. Passage: The {item} was kept in the {location} by {actor}. "
                "The {distractor} was separate and went to the {distractor_location}. "
                "Question: What location contains the {item}?"
            ),
        ]
        prompt = prompt_styles[(case // 7) % len(prompt_styles)].format(
            actor=actor,
            helper=helper,
            asker=asker,
            item=item,
            location=location,
            distractor=distractor,
            distractor_location=distractor_location,
        )
        answer = (
            f"The {item} is in the {location}. The clue tying {item} to {location} is the relevant detail; "
            f"the {distractor} belongs to a separate thread."
        )
        return _record(
            category="long_context_recall",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Track the named item across a longer passage and answer only from context.",
            safety_label="safe",
            config=config,
        )

    people = ["Nkechi", "Daniel", "Aisha", "Omar", "Lena", "Kofi"]
    cities = ["Osogbo", "Lagos", "Kano", "Accra", "Nairobi", "Ibadan"]
    tools = ["blue notebook", "solar lamp", "field radio", "water filter", "repair kit", "seed ledger"]
    days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
    tool = tools[index % len(tools)]
    person = people[(index // len(tools)) % len(people)]
    city = cities[(index // (len(tools) * len(people))) % len(cities)]
    day = days[(index // (len(tools) * len(people) * len(cities))) % len(days)]
    prompt = (
        f"Read the passage and answer the question. Passage: {person} traveled to {city} on {day}. "
        f"Before leaving, {person} packed a {tool}, two maps, and a sealed lunch. The maps were for the driver, "
        f"but the {tool} was meant for the evening workshop. Question: What item was meant for the evening workshop?"
    )
    answer = (
        f"The {tool} was meant for the evening workshop. The maps are a distractor; "
        f"{person} packed the {tool}, and that earlier detail identifies the workshop item."
    )
    return _record(
        category="long_context_recall",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Hold earlier details in memory and answer only from the passage.",
        safety_label="safe",
        config=config,
    )


def _build_world_knowledge(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "Explain plate tectonics to a curious 12-year-old using a clear analogy.",
            "Imagine Earth's surface as cracked shell pieces floating on very slow, hot syrup. Those pieces are tectonic plates. They move only a little each year, but when they push, pull apart, or slide past each other, they can build mountains, open ocean floors, cause earthquakes, or feed volcanoes.",
        ),
        (
            "Teach plate tectonics with a simple analogy for a young learner.",
            "Think of Earth like a giant puzzle whose pieces move slowly on softer rock underneath. The moving pieces are plates. Where they meet, they can press upward into mountains, split apart to make new crust, or rub suddenly and shake the ground.",
        ),
        (
            "Explain earthquakes and tectonic plates without making it scary.",
            "Earthquakes happen when rocks along plate boundaries get stuck while the plates keep moving. Stress builds, then the rocks slip and release energy as shaking. The idea is simple: slow movement stores force, and sudden release makes the quake.",
        ),
        (
            "Explain why volcanoes often appear near plate boundaries.",
            "Volcanoes often appear near plate boundaries because moving plates can pull crust apart or push one plate beneath another. Both situations help melted rock find a path upward, so heat from inside Earth reaches the surface as lava and ash.",
        ),
        (
            "Summarize why public-key cryptography matters for everyday internet safety.",
            "Public-key cryptography lets strangers on the internet start a protected conversation without sharing a secret first. It helps websites prove their identity, protects login and payment traffic, and makes it harder for attackers to read or silently change data in transit.",
        ),
        (
            "Explain public-key cryptography in plain language.",
            "Public-key cryptography uses a pair of linked keys: one can be shared publicly and the other stays private. That pairing helps people verify identities, exchange secure messages, and protect everyday web traffic without handing the private key to everyone.",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer = priority_cases[index]
        return _record(
            category="world_knowledge",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Explain stable science in concrete language with an analogy and without overstating the simplification.",
            safety_label="safe",
            config=config,
        )

    index -= len(priority_cases)
    topics = [
        ("plate tectonics", "Earth's outer shell is broken into slow-moving plates that build mountains, open ocean floors, trigger earthquakes, and feed some volcanoes"),
        ("the water cycle", "evaporation lifts water vapor, condensation forms clouds, and precipitation returns water to land and sea"),
        ("photosynthesis", "plants use light, water, and carbon dioxide to make sugars while releasing oxygen"),
        ("vaccination", "a vaccine trains immune memory so the body can respond faster to a real infection"),
        ("electric circuits", "current needs a closed path, a source of energy, and components that shape resistance"),
        ("democracy", "citizens choose representatives, institutions limit power, and public accountability matters"),
        ("soil erosion", "wind or water removes topsoil, especially when vegetation and roots no longer hold it"),
        ("supply and demand", "prices tend to rise when demand grows faster than supply and fall when supply outpaces demand"),
        ("cloud computing", "remote servers provide storage and computation that users access over a network"),
        ("databases", "tables or documents store structured facts, indexes speed up lookup, and transactions protect consistency"),
        ("internet routing", "packets move through networks by following routes chosen by routers and updated by routing protocols"),
        ("cybersecurity basics", "systems improve when authentication, patching, backups, least privilege, and monitoring work together"),
        ("climate versus weather", "weather describes short term conditions while climate summarizes long term patterns"),
        ("antibiotics", "antibiotics treat bacterial infections but do not work on viruses and should be used responsibly"),
        ("gravity and orbits", "gravity pulls objects together while sideways motion can keep a satellite falling around a planet"),
        ("accounting basics", "income, expenses, assets, liabilities, and cash flow describe different parts of financial health"),
        ("project management", "clear goals, owners, dependencies, and review points help teams finish complex work"),
        ("agriculture and soil health", "healthy soil keeps nutrients, water, microbes, and roots working together"),
        ("public health", "public health reduces risk across communities through prevention, surveillance, education, and access"),
        ("encryption", "encryption transforms readable information into protected form so only authorized keys can read it"),
        ("renewable energy", "solar, wind, hydro, and geothermal sources can produce energy while reducing fuel dependence"),
        ("machine learning", "models find patterns from data, but evaluation must check whether those patterns generalize"),
        ("datacenter cooling", "servers create heat, so airflow, liquid cooling, and efficient layout protect performance"),
        ("supply chains", "materials, transport, inventory, and demand signals connect producers to customers"),
        ("civic rights", "rights set boundaries around what institutions may do and what people can claim or defend"),
        ("compilers", "a compiler reads source code, checks structure, transforms it, and emits a form the machine can run"),
        ("operating systems", "an operating system manages processes, memory, files, devices, permissions, and user programs"),
        ("APIs", "an API defines how software components ask for data or actions without depending on private internals"),
        ("unit testing", "unit tests exercise small pieces of behavior so regressions become visible when code changes"),
        ("time complexity", "time complexity describes how work grows as input size grows, helping engineers compare algorithms"),
        ("electric power grids", "generation, transmission, distribution, and control systems keep electricity balanced with demand"),
        ("semiconductors", "semiconductors use controlled electrical properties to build transistors, sensors, and chips"),
        ("large language models", "language models estimate text patterns from data, but useful systems also need evaluation, safety, and grounding"),
        ("robotics", "robots combine sensing, planning, control, and mechanical design to act in the physical world"),
        ("epidemiology", "epidemiology studies how disease spreads, what increases risk, and which interventions reduce harm"),
        ("contract law basics", "contracts depend on offer, acceptance, consideration, capacity, and enforceable terms"),
        ("central banks", "central banks influence money supply, interest rates, payment stability, and inflation expectations"),
    ]
    if index % 7 == 2:
        bundles = [
            (
                "clean water, vaccination, and roads",
                "public health",
                "clean water lowers disease risk, vaccination builds immune memory, and roads help clinics, vaccines, food, and emergency teams reach people",
            ),
            (
                "schools, libraries, and internet access",
                "education",
                "schools organize learning, libraries preserve shared knowledge, and internet access helps learners reach current information and mentors",
            ),
            (
                "farm storage, weather forecasts, and market roads",
                "food security",
                "storage reduces waste, forecasts help farmers plan, and market roads move food before it spoils",
            ),
            (
                "identity documents, banking access, and cybersecurity",
                "digital trust",
                "identity documents establish who is acting, banking access supports payments, and cybersecurity protects accounts from abuse",
            ),
        ]
        bundle, domain, summary = bundles[(index // 7) % len(bundles)]
        prompt_styles = [
            "Summarize why {bundle} matter for {domain}.",
            "Explain how {bundle} work together in {domain}.",
            "Give a plain-language overview connecting {bundle} to {domain}.",
            "Why should a learner connect {bundle} when thinking about {domain}?",
        ]
        prompt = prompt_styles[(index // (7 * len(bundles))) % len(prompt_styles)].format(
            bundle=bundle,
            domain=domain,
        )
        answer = (
            f"For {domain}, {summary}. The shared pattern is that practical systems work together: "
            "one part prevents harm, another improves access, and another keeps services reachable when people need them."
        )
    elif index % 5 == 1:
        paired_topics = [
            (
                "cloud computing",
                "databases",
                "cloud computing provides remote compute and storage, while databases organize the facts applications need to retrieve, update, and protect",
            ),
            (
                "public health",
                "vaccination",
                "public health looks at community risk, while vaccination builds immune memory that lowers severe disease and spread",
            ),
            (
                "internet routing",
                "encryption",
                "routing moves packets across networks, while encryption protects the information inside those packets from unauthorized readers",
            ),
            (
                "renewable energy",
                "datacenter cooling",
                "renewable energy can reduce fuel dependence, while efficient cooling keeps servers reliable with less wasted power",
            ),
            (
                "soil health",
                "agriculture",
                "soil health supports roots, water, nutrients, and microbes, which makes agriculture more resilient over time",
            ),
            (
                "unit testing",
                "debugging",
                "unit testing catches expected behavior automatically, while debugging traces why actual behavior drifted",
            ),
            (
                "APIs",
                "databases",
                "APIs shape requests and responses, while databases persist the facts those requests need",
            ),
            (
                "compilers",
                "operating systems",
                "compilers prepare programs for execution, while operating systems schedule and protect running programs",
            ),
            (
                "large language models",
                "evaluation",
                "language models can sound fluent, while evaluation checks whether answers are correct, safe, and useful",
            ),
        ]
        left, right, summary = paired_topics[(index // 5) % len(paired_topics)]
        matter_verb = "matter" if right.endswith("s") else "matters"
        prompt_styles = [
            "Explain {left} and why {right} {matter_verb}.",
            "Connect {left} with {right} for a curious learner.",
            "Compare {left} and {right} in plain language.",
            "Why do {left} and {right} matter together?",
            "Give a clear summary linking {left} to {right}.",
        ]
        prompt = prompt_styles[
            (index // (5 * len(paired_topics))) % len(prompt_styles)
        ].format(
            left=left,
            right=right,
            matter_verb=matter_verb,
        )
        answer = (
            f"{left.capitalize()} and {right} connect through practical systems: {summary}. "
            "A strong explanation defines both ideas, shows how they support each other, and stays clear about limits."
        )
    else:
        topic, summary = topics[index % len(topics)]
        prompt_styles = [
            "Summarize {topic} clearly for a curious learner.",
            "Explain {topic} in plain language.",
            "What should a learner understand about {topic}?",
            "Give a careful overview of {topic}.",
            "Describe {topic} with useful context.",
            "Teach the basics of {topic} without hype.",
        ]
        prompt = prompt_styles[
            (index // len(topics)) % len(prompt_styles)
        ].format(topic=topic)
        answer = (
            f"{topic.capitalize()} can be understood through its main mechanism: {summary}. "
            "The useful explanation connects the parts, shows why the process matters, and keeps the limits of the summary visible."
        )
    return _record(
        category="world_knowledge",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Give stable world knowledge with careful wording and no unsupported certainty.",
        safety_label="safe",
        config=config,
    )


def _build_physical_reasoning(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    cases = [
        (
            "glass",
            "ladder",
            "it is brittle under impact and bending, so a small crack can become a sudden break",
            "aluminum",
            "it is light, tough, and easier to inspect for damage",
        ),
        (
            "glass",
            "bridge railing",
            "it can handle compression in the right design but needs careful support because edges and point impacts are weak spots",
            "laminated safety glass with steel supports",
            "the layers hold fragments together while the frame carries load",
        ),
        (
            "glass",
            "greenhouse roof",
            "hail, wind, and thermal stress can turn a rigid sheet into a fracture risk",
            "polycarbonate",
            "it stays transparent while handling impact and flex better",
        ),
        (
            "ceramic",
            "bicycle frame",
            "ceramics resist heat and wear but are poor at surviving vibration, torsion, and sudden impacts",
            "steel",
            "it bends before it breaks and tolerates repeated road loads",
        ),
        (
            "glass",
            "bicycle frame",
            "the frame would face vibration, bending, side impacts, and hidden cracks that can grow without warning",
            "aluminum",
            "it is light, ductile, repairable, and proven for repeated road loads",
        ),
        (
            "glass",
            "scooter deck",
            "rider weight and pavement shocks would concentrate stress around scratches and mounting holes",
            "aluminum",
            "it handles impact and bending while staying light enough to carry",
        ),
        (
            "ceramic",
            "wheel rim",
            "road impacts can chip or shatter brittle material at the edge",
            "aluminum alloy",
            "it balances weight, toughness, and serviceability",
        ),
        (
            "paper",
            "rain shelter",
            "water weakens the fibers, and wind can tear the structure before it protects anyone",
            "coated fabric over a metal frame",
            "the fabric sheds water while the frame carries tension",
        ),
        (
            "rubber",
            "kitchen knife blade",
            "rubber flexes instead of holding a sharp cutting edge",
            "stainless steel",
            "it keeps an edge, resists corrosion, and can be cleaned safely",
        ),
        (
            "wood",
            "bicycle brake disc",
            "friction heat and moisture would change the surface and make stopping unreliable",
            "stainless steel",
            "it handles heat and keeps predictable friction",
        ),
        (
            "concrete",
            "drone frame",
            "the mass would overwhelm the motors and reduce control",
            "carbon fiber",
            "it gives high stiffness with low weight",
        ),
        (
            "paper",
            "drone frame",
            "paper loses stiffness with moisture and repeated vibration, so the frame can twist before the controller can stabilize it",
            "carbon fiber",
            "it stays light while resisting bending and vibration",
        ),
        (
            "cardboard",
            "drone arm",
            "the arm must hold motor thrust without twisting, and cardboard crushes or delaminates under repeated load",
            "carbon fiber tube",
            "it keeps the motor aligned while adding little mass",
        ),
        (
            "paper",
            "model aircraft wing",
            "paper can work for a tiny indoor model but tears easily and changes shape with humidity",
            "balsa wood or foam board",
            "they stay light while holding a cleaner airfoil",
        ),
        (
            "clay",
            "water pipe",
            "it can crack under ground movement or pressure spikes unless heavily protected",
            "ductile iron or HDPE",
            "they tolerate pressure changes and movement better",
        ),
        (
            "thin plastic",
            "cooking pot handle",
            "heat can soften or deform it near the flame",
            "heat-rated polymer or wood",
            "it insulates the hand while surviving normal cooking temperatures",
        ),
        (
            "bamboo",
            "bicycle frame",
            "it can work only when joints, sealing, and fiber direction are engineered carefully",
            "aluminum",
            "it gives predictable mass production and easier quality control",
        ),
        (
            "aluminum",
            "boat anchor",
            "low weight works against the job because an anchor needs mass and grip",
            "galvanized steel",
            "it adds weight, strength, and corrosion resistance",
        ),
        (
            "foam",
            "load-bearing table leg",
            "it compresses and buckles under steady load",
            "hardwood or steel",
            "they keep shape under compression and side force",
        ),
        (
            "copper",
            "phone case",
            "it conducts heat and electricity well but adds cost, weight, and signal shielding problems",
            "polycarbonate",
            "it is light, protective, and radio-friendly",
        ),
        (
            "stone",
            "running shoe sole",
            "it is too heavy and rigid for repeated foot impact",
            "rubber",
            "it grips the ground and cushions motion",
        ),
        (
            "carbon fiber",
            "playground slide",
            "it is strong but expensive and can splinter if damaged",
            "molded plastic or stainless steel",
            "they are easier to inspect, clean, and maintain for public use",
        ),
    ]
    material, object_name, risk, alternative, advantage = cases[index % len(cases)]
    prompt_styles = [
        "Explain why a {material} {object_name} would be hard to build, then give one safer material.",
        "A designer proposes a {material} {object_name}. Explain the main failure risk and one better material.",
        "For a {object_name}, compare {material} with a safer alternative and give the reason.",
        "Give a practical engineering answer about whether {material} works for a {object_name}.",
        "Why might {material} be a poor choice for a {object_name}, and what should replace it?",
        "Explain whether a {material} {object_name} is practical, then name one safe alternative material.",
    ]
    answer_styles = [
        (
            "A {material} {object_name} is difficult because {risk}. "
            "A safer material is {alternative}, because {advantage}."
        ),
        (
            "The main issue is not the idea itself; it is the material behavior. "
            "For a {object_name}, {material} has this risk: {risk}. "
            "{capitalized_alternative} is safer because {advantage}."
        ),
        (
            "{capitalized_material} can be useful in some designs, but for a {object_name} the important load case matters. "
            "Here, {risk}. I would choose {alternative}, since {advantage}."
        ),
        (
            "A practical design answer checks load, impact, heat, moisture, and repair. "
            "For a {material} {object_name}, the weak point is that {risk}. "
            "A better choice is {alternative}: {advantage}."
        ),
        (
            "I would treat the {material} {object_name} as risky because {risk}. "
            "The safer replacement is {alternative}, which works better because {advantage}."
        ),
    ]
    prompt = prompt_styles[
        (index // len(cases)) % len(prompt_styles)
    ].format(material=material, object_name=object_name)
    answer = answer_styles[
        (index // (len(cases) * len(prompt_styles))) % len(answer_styles)
    ].format(
        material=material,
        capitalized_material=material.capitalize(),
        capitalized_alternative=alternative.capitalize(),
        object_name=object_name,
        risk=risk,
        alternative=alternative,
        advantage=advantage,
    )
    return _record(
        category="physical_reasoning",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Use material properties, load cases, and safer alternatives without pretending every design has one universal answer.",
        safety_label="safe",
        config=config,
    )


def _build_tool_use_grounding(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "Explain why source-grounded tool use can help with unknown topics.",
            "Source-grounded tool use helps because the model can separate stable reasoning from facts that need evidence. If the topic is unknown, current, local, or source-specific, a tool result gives something concrete to read, cite, and check instead of filling gaps from memory.",
            "Explain tool use as evidence gathering for unknown, current, or source-specific questions.",
        ),
        (
            "Why should an assistant use sources when a topic may be current or unfamiliar?",
            "Sources help keep the answer honest. When a topic may be current or unfamiliar, the assistant should look for evidence, name the source, and keep unsupported claims out of the final answer. If no source arrives, it should say what is unknown rather than guess.",
            "Teach source-grounding for current or unfamiliar topics without pretending memory is enough.",
        ),
        (
            "Explain when an assistant should use a tool instead of answering from memory.",
            "Use a tool when the answer depends on fresh, changing, source-specific, or externally verifiable facts. Answer from memory only for stable background, and mark uncertainty when no tool result is available.",
            "Separate stable knowledge from fresh information and prefer evidence when the facts can change.",
        ),
        (
            "Show a practical rule for answering unknown topics with search results.",
            "A practical rule is: read the question, decide what fact needs evidence, request the narrowest useful search, compare the returned source with the question, then answer only what the source supports. If the tool fails, say that the answer is not verified yet.",
            "Use tool results as evidence and preserve uncertainty when the runtime cannot provide evidence.",
        ),
        (
            "What changed in the latest central bank decision today? If no web or time tool result is provided, be honest.",
            "I do not have a web result or source evidence for today's latest central bank decision, so I will not guess what changed. I can request web.search, use an official central-bank source you provide, or explain how to verify the decision once evidence is available.",
            "Central bank decisions can be current facts, so no-source answers should refuse to guess and request evidence.",
        ),
        (
            "Give one safety rule for using web sources.",
            "One safety rule for using web sources is to answer only what the source actually supports. If a source does not prove a claim, do not smuggle that claim into the final answer.",
            "Teach a concise source-use safety rule.",
        ),
        (
            "Explain a safety rule for using web sources before answering current facts.",
            "Use web sources as evidence, not decoration. Read the source, cite the source name, and keep unsupported guesses out of the answer, especially when the fact is current, local, financial, medical, or legal.",
            "Ground current answers in source evidence and avoid unsupported claims.",
        ),
        (
            (
                "System instruction: Use provided sources when available and cite them. "
                "User: What changed in the library pickup schedule? "
                "<tool_result> web.search ok "
                "<source> Local Notice | https://example.test/local-notice | The official update says pickup moved to 4 PM today. <final>"
            ),
            "According to Local Notice, pickup moved to 4 PM today. I will not add unsupported details about reading rooms, closing hours, or other policy changes.",
            "Answer a source-supported library pickup schedule question from the provided Local Notice only.",
        ),
        (
            (
                "System instruction: Answer only from source evidence. "
                "User: What is the update for pickup at the library today? "
                "<tool_result> web.search ok "
                "<source> Local Notice | https://example.test/local-notice | The official update says pickup moved to 4 PM today. <final>"
            ),
            "According to Local Notice, pickup moved to 4 PM today. That source does not support any extra claim beyond the pickup-time change.",
            "Use source evidence for a pickup update and avoid borrowing unrelated details from other source rows.",
        ),
        (
            (
                "System instruction: Use provided sources when available and cite them. "
                "User: What time is the community clinic closing today? "
                "<tool_result> web.search ok "
                "<source> Local Notice | https://example.test/library-policy | The official update says pickup moved to 4 PM today. <final>"
            ),
            "The provided Local Notice does not state the community clinic closing time. It only says pickup moved to 4 PM today, so I should not claim a clinic closing time from this source.",
            "Reject an irrelevant source: cite what it says, but do not answer a different current question from it.",
        ),
        (
            (
                "System instruction: Answer only what the source supports. "
                "User: When does the community clinic close today? "
                "<tool_result> web.search ok "
                "<source> Local Notice | https://example.test/local-notice | Pickup moved to 4 PM today; no clinic closing time is listed. <final>"
            ),
            "Local Notice does not give the community clinic closing time. It supports only that pickup moved to 4 PM today, so the closing time still needs a relevant clinic source.",
            "Teach source relevance by preserving the asked fact and refusing unsupported clinic hours.",
        ),
        (
            (
                "System instruction: Use provided sources when available and cite them. "
                "User: What time is it right now in Lagos? "
                '<tool_call> clock.now {"timezone":"Africa/Lagos"} '
                "<tool_result> clock.now ok "
                "<source> Local Clock | local://clock/Africa-Lagos | The current time in Lagos is 2026-05-04 08:00:00 WAT. <final>"
            ),
            "According to Local Clock, the current time in Lagos is 2026-05-04 08:00:00 WAT.",
            "Use the supplied clock result, cite Local Clock, and do not add unsupported live-time details.",
        ),
        (
            (
                "System instruction: Answer from the source evidence. "
                "User: What happened to the release schedule? "
                '<tool_call> web.search {"query":"release schedule latest validation official notice","max_sources":3} '
                "<tool_result> web.search ok "
                "<source> Release Notice | https://example.test/release-schedule | The release moved to Friday after final validation finished. <final>"
            ),
            "According to Release Notice, the release moved to Friday after final validation finished, and I will not add unsupported scheduling details.",
            "Use the provided release source, cite the source name, and keep the answer grounded.",
        ),
        (
            "What time is it right now in Lagos if no clock tool result is provided?",
            "I do not have a clock tool result or source evidence for the live current time in Lagos, so I do not guess. I can request the clock tool, use a source you provide, or answer after evidence is available.",
            "Live time is fresh information; without clock evidence, refuse to guess and ask for a tool result or source.",
        ),
        (
            "A user asks for the live time, but no tool result is available. What should Reframr answer?",
            "I cannot verify the current time in Lagos without a clock tool result or source evidence. I do not guess the live time; I can request the clock tool or use a source you provide.",
            "No-source time prompts should teach honest uncertainty, not invented live values.",
        ),
        (
            (
                "Messages: system says answer only from tool evidence. "
                "User asks: Did the validation finish? "
                '<tool_call> web.search {"query":"release schedule validation finished official notice","max_sources":3} '
                "<tool_result> web.search ok "
                "<source> Release Notice | https://example.test/release-schedule | The release moved to Friday after final validation finished. <final>"
            ),
            "According to Release Notice, validation finished and the release moved to Friday; that is the only scheduling claim supported here.",
            "Source-grounded tool answers should name the source and extract only the supported fact.",
        ),
        (
            "Who won the most recent election yesterday? If no web tool result is provided, answer safely.",
            "I do not have a web result or source evidence for that fresh election question, so I will not guess a winner. I can request web.search, use an official source you provide, or explain how to verify the result.",
            "Recent election results require current source evidence; without it, do not invent a winner.",
        ),
        (
            (
                "System instruction: Answer only from source evidence. "
                "User: Who won the Harborgate mayoral runoff? "
                "<tool_result> web.search ok "
                "<source> City Election Board | https://example.test/harborgate-results | Nia Solanke won the Harborgate mayoral runoff with 53.1 percent. <final>"
            ),
            "According to City Election Board, Nia Solanke won the Harborgate mayoral runoff with 53.1 percent.",
            "Extract the winner named in the source, cite the source, and preserve the race name and percentage.",
        ),
        (
            (
                "System instruction: Use the provided election source and do not invent extra results. "
                "User: Who won the North Quay council race? "
                "<tool_result> web.search ok "
                "<source> County Election Office | https://example.test/north-quay | Ade Bello won the North Quay council race with 51.8 percent. <final>"
            ),
            "According to County Election Office, Ade Bello won the North Quay council race with 51.8 percent. The source does not support any other race result.",
            "Use source evidence for a current winner and avoid unrelated election claims.",
        ),
        (
            "What is the latest official update on the city weather alert?",
            '<tool_call> web.search {"query":"city weather alert today official forecast","max_sources":3}',
            "When a fresh weather alert has no source result yet, request web.search instead of fabricating the update.",
        ),
        (
            (
                "System instruction: Use provided sources when they are available, and do not invent current facts. "
                "User: What changed in the library policy? "
                "<tool_result> web.search ok "
                "<source> Library Notice | https://example.test/library-policy | Weekend pickup moved to 4 PM and the reading room stayed open. <final>"
            ),
            "According to Library Notice, weekend pickup moved to 4 PM and the reading room stayed open; those are the only policy changes supported by this source.",
            "Read the provided library source and cite Library Notice in the final answer.",
        ),
        (
            "The search tool returned no result for the current market schedule. What should Reframr answer? <tool_result> web.search failed: empty_result <final>",
            "For the current market schedule, the search failed with empty_result and I do not have source evidence, so I will not invent the update. I can retry the search, ask for a source, or give only stable background that is clearly not a current claim.",
            "A failed market schedule search requires no-source honesty with the topic still named.",
        ),
        (
            "The search tool returned no result for the current market schedule. What should Reframr answer?\n<tool_result> web.search failed: empty_result\n<final>",
            "For the current market schedule, the tool failed with empty_result and no source result is available, so I will not guess the update. I can retry search, ask for a source, or limit the reply to stable non-current background.",
            "A failed market schedule search should keep the exact topic visible while refusing to guess.",
        ),
        (
            "A user asks about the current market schedule, but no tool result is available. What is the safe answer?",
            "For the current market schedule, the tool failed or returned no source, so I need evidence before naming the live update. Without source evidence, I will not guess; I can retry search or use a source you provide.",
            "No-source market schedule answers should name the topic and refuse to fabricate fresh facts.",
        ),
        (
            "A tool timed out while checking a software release. Show the next safe move. <tool_result> web.search failed: timeout <final>",
            '<tool_call> web.search {"query":"software release status latest changelog official source","max_sources":3}',
            "After a software-release timeout, retry with a release-specific source query, not an unrelated domain.",
        ),
        (
            (
                "System instruction: Use sources when provided, cite the source name, and keep the answer concise. "
                "User: What changed in the release schedule? "
                '<tool_call> web.search {"query":"release schedule official update","max_sources":3} '
                "<tool_result> web.search ok "
                "<source> Release Notice | https://example.test/release-schedule | The release moved to Friday after final validation finished. "
                "User: What should I do with this source result? <final>"
            ),
            "According to Release Notice, the release moved to Friday after final validation finished. Use that source as evidence and avoid adding unsupported schedule details.",
            "When a source result is already present, answer from it, cite it, and stop without drifting into unrelated facts.",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer, reasoning = priority_cases[index]
        return _record(
            category="tool_use_grounding",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary=reasoning,
            safety_label="safe",
            config=config,
        )

    no_source_cases = [
        (
            "Tell me who won the newest mayoral election yesterday, then explain what you would do if no web result is provided.",
            "I cannot name the winner of a newest mayoral election yesterday without a web result or source evidence. I would retry web.search with the city, date, and official election source, or ask you for a source; without evidence, I can only explain that the answer is unverified.",
        ),
        (
            "Tell me who won the newest mayoral election yesterday, then explain what you would do if no web result is provided.",
            "I do not have source evidence for that newest mayoral election, so I should not invent a winner. The next move is to search official results, narrow by city and date, or use a source you provide before naming anyone.",
        ),
        (
            "Tell me who won the newest mayoral election yesterday, then explain what you would do if no web result is provided.",
            "Without a web result, I cannot verify who won that mayoral election yesterday. I would state the limit clearly, retry web.search against official election pages, and only answer with a cited source.",
        ),
        (
            "Who won the most recent election yesterday? If no web result is provided, answer safely.",
            "I do not have source evidence for a most recent election yesterday, so I will not guess a winner. I would ask for the location or run web.search against official election results before giving a name.",
        ),
        (
            "A user asks who won an election yesterday, but the web search result is missing. What should the answer say?",
            "No source evidence is available for that fresh election result, so I cannot reliably name a winner. I can retry search, use an official result you provide, or give stable background about how to verify election results.",
        ),
        (
            "Tell me the winner of yesterday's local race and explain the no-tool boundary.",
            "A yesterday local-race winner is fresh information, so I need a tool result or official source before naming anyone. If no tool result is available, I should say the result is unverified and ask to search or use a provided source.",
        ),
        (
            "Who won the newest mayoral election yesterday if no source evidence is available?",
            "Without current source evidence, I cannot verify the newest mayoral election result or name the winner. The reliable path is to check official election results, cite the source, and avoid filling the gap from memory.",
        ),
        (
            "Answer a fresh election question with no tool result: who won yesterday?",
            "I cannot answer a fresh election winner from memory when no tool result is present. I would request a source, retry web.search, or clearly say that the current result is unknown to me.",
        ),
    ]
    no_source_case_offset = len(priority_cases)
    if no_source_case_offset <= index < no_source_case_offset + len(no_source_cases):
        prompt, answer = no_source_cases[index - no_source_case_offset]
        return _record(
            category="tool_use_grounding",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Fresh election results require source evidence; without a tool result, do not invent a winner.",
            safety_label="safe",
            config=config,
        )

    fresh_domains = [
        (
            "current news",
            "current news official reliable source today",
            "News Source",
            "https://example.test/current-news",
            "The verified update is available only from the latest source result, not from memory.",
        ),
        (
            "library policy",
            "library policy today official notice",
            "Library Notice",
            "https://example.test/library-policy",
            "The library moved weekend pickup to 4 PM and kept the reading room open.",
        ),
        (
            "software release status",
            "Reframr runtime release status latest changelog",
            "Release Changelog",
            "https://example.test/reframr-release",
            "The runtime update passed smoke tests and needs one more long-context probe.",
        ),
        (
            "weather alert",
            "city weather alert today official forecast",
            "Weather Service",
            "https://example.test/weather-alert",
            "Heavy rain is expected after 6 PM, with travel delays likely near low roads.",
        ),
        (
            "security advisory",
            "current security advisory official patch guidance",
            "Security Advisory",
            "https://example.test/security-advisory",
            "Admins should patch exposed servers and rotate keys after confirming logs.",
        ),
        (
            "market schedule",
            "market schedule today official update",
            "Market Bulletin",
            "https://example.test/market-schedule",
            "The morning session opens one hour later because maintenance is still running.",
        ),
        (
            "current time in Lagos",
            "current time Lagos clock timezone",
            "Local Clock",
            "local://clock/Africa-Lagos",
            "The current time in Lagos is 2026-05-04 08:00:00 WAT.",
        ),
        (
            "release schedule",
            "release schedule latest validation official notice",
            "Release Notice",
            "https://example.test/release-schedule",
            "The release moved to Friday after final validation finished.",
        ),
    ]
    failure_modes = [
        ("timeout", "the search provider timed out before returning evidence"),
        ("rate_limit", "the provider refused the request because the rate limit was reached"),
        ("empty_result", "the tool returned no useful source for the question"),
        ("bad_source", "the first result was a forum rumor rather than a reliable source"),
    ]
    domain, query, title, url, snippet = fresh_domains[index % len(fresh_domains)]
    failure, failure_text = failure_modes[(index // len(fresh_domains)) % len(failure_modes)]
    prompt_styles = [
        "Use fresh information to answer the question about {domain}.",
        "Check current sources before answering about {domain}.",
        "What is the latest official update on {domain}?",
        "What changed in {domain}?",
        "If the tool fails, recover carefully and answer about {domain} only from evidence.",
        "Show how Reframr should use web.search for a fresh {domain} question.",
        "Answer from the tool result and cite the source for {domain}.",
        "Explain when tool use is required for a question about {domain}.",
        "If someone asks for current news and no tool result is available, what should Reframr do?",
        "What should Reframr do when a fresh-information question has no source evidence?",
        "A user asks about {domain}, but no tool result is available. What is the safe answer?",
        "What time is it right now in Lagos if no clock tool result is provided?",
        "A user asks for the live time, but no tool result is available. What should Reframr answer?",
    ]
    style_index = index % len(prompt_styles)
    prompt = prompt_styles[style_index].format(domain=domain)
    if style_index == 8:
        domain = "current news"
        prompt = "If someone asks for current news and no tool result is available, what should Reframr do?"
    elif style_index == 9:
        prompt = "What should Reframr do when a fresh-information question has no source evidence?"
    elif style_index in {11, 12}:
        domain = "current time in Lagos"
    query_json = json.dumps({"query": query, "max_sources": 3}, separators=(",", ":"))
    retry_query_json = json.dumps(
        {"query": f"{query} official source", "max_sources": 3},
        separators=(",", ":"),
    )
    source_claim = (
        f"According to {title}, {snippet} "
        "This is the only current claim supported by that source."
    )
    variant = index % 8
    no_source_prompt = style_index in {8, 9, 10, 11, 12}
    if no_source_prompt:
        if domain == "current time in Lagos":
            no_source_answers = [
                (
                    "I do not have a clock tool result or source evidence for the live current time in Lagos, so I do not guess. "
                    "I can request the clock tool, use a source you provide, or answer after evidence is available."
                ),
                (
                    "I cannot verify the live current time in Lagos without a clock source. I do not guess the live time; I can request the clock tool or use a source you provide."
                ),
                (
                    "No clock source is available for the live current time in Lagos, so I do not guess or invent one. Give me a clock result or a source, and I can answer from that evidence."
                ),
            ]
        elif domain == "market schedule":
            no_source_answers = [
                (
                    "For the current market schedule, the tool failed or returned empty_result and I do not have source evidence, so I will not guess. "
                    "I can retry search, use a source you provide, or give only stable non-current background."
                ),
                (
                    "I cannot verify the current market schedule because the search failed or returned no source evidence. I will not invent the update; I can retry search or use an official source you provide."
                ),
                (
                    "The current market schedule search came back empty_result with no current source evidence, so I will not guess. Give me a source, let me retry the tool, or I can explain only stable context."
                ),
            ]
        else:
            no_source_answers = [
                (
                    "I do not have a tool result or source evidence for that current question, so I cannot name the fresh answer reliably. "
                    "I can retry the search or use a source you provide. Without that evidence, I can only give stable background and label it as non-current."
                ),
                (
                    f"I cannot verify without source evidence: the current {domain} answer needs a source result. "
                    "The honest next move is to retry search, use a source you provide, or keep the reply to clearly marked non-current background."
                ),
                (
                    f"No current source evidence came back for {domain}, so I will not guess. "
                    "Give me a source, let me retry the tool, or I can explain only stable context while marking it as not a fresh update."
                ),
            ]
        answer = no_source_answers[index % len(no_source_answers)]
        reasoning = "No-source fresh-information prompts must teach refusal to fabricate, not synthetic successful tool output."
    elif variant == 0:
        answer = f"<tool_call> web.search {query_json}"
        reasoning = "For a fresh-information prompt without a tool result, request web.search and stop so the runtime can provide evidence."
    elif variant == 1:
        prompt = (
            f"The first web.search attempt for {domain} failed with {failure}. "
            f"Show the retry call. <tool_result> web.search failed: {failure} <final>"
        )
        answer = f"<tool_call> web.search {retry_query_json}"
        reasoning = "Detect the failed tool result and retry once with a narrower query instead of inventing the source result."
    elif variant == 2:
        prompt = (
            f"web.search failed with {failure} for {domain}. What should Reframr answer if no source came back? "
            f"<tool_result> web.search failed: {failure} <final>"
        )
        answer = (
            "I could not get current source evidence from the tool, so I cannot give a reliable fresh answer yet. "
            "I can retry the search, use a source you provide, or give stable background clearly marked as non-current."
        )
        reasoning = "When no current source is available, refuse to fabricate fresh facts and ask for a retry or a source."
    elif variant == 3:
        answer = (
            f"<tool_result> web.search ok "
            f"<source> {title} | {url} | {snippet} "
            f"<final> {source_claim} "
            "That source is the evidence for the current claim; I keep any recommendation separate from the cited fact."
        )
        reasoning = "When a tool result is already present, skip a new call, read the source, cite it, and answer from the provided evidence."
    elif variant == 4:
        prompt = (
            f"System instruction: Use provided sources when they are available, and do not invent current facts. "
            f"User: What changed in {domain}? "
            f"<tool_result> web.search ok "
            f"<source> {title} | {url} | {snippet} <final>"
        )
        answer = (
            f"{source_claim} I am using the supplied tool result as evidence and will not add unsupported current details."
        )
        reasoning = "Read the supplied tool result as context, cite the source, and answer from the evidence already present."
    elif variant == 5:
        prompt = (
            f"The search tool returned no result for the current {domain}. What should Reframr answer? "
            f"<tool_result> web.search failed: empty_result <final>"
        )
        if domain == "market schedule":
            answer = (
                "For the current market schedule, the search failed with empty_result and I do not have source evidence, so I will not invent the update. "
                "I can retry the search, ask for a source, or give only stable background that is clearly not a current claim."
            )
        else:
            answer = (
                "I do not have current source evidence for that fresh question. I will not invent the current update; "
                "I can retry the search, ask for a source, or give only stable background that is clearly not a current claim."
            )
        reasoning = "Treat a missing source as a boundary: do not fabricate fresh facts, offer retry or source-grounded alternatives."
    elif variant == 6:
        prompt = (
            f"A tool timed out while checking {domain}. Show the next safe move. "
            f"<tool_result> web.search failed: timeout <final>"
        )
        answer = f"<tool_call> web.search {retry_query_json}"
        reasoning = "After a timeout, retry once with a narrower query; if it still fails, refuse to invent current evidence."
    else:
        prompt = "Explain when an assistant should use a tool instead of answering from memory."
        answer = (
            "Use a tool when the answer depends on fresh, changing, source-specific, or externally verifiable facts. "
            "Answer from memory only for stable background, and mark uncertainty when no tool result is available."
        )
        reasoning = "Separate stable knowledge from fresh information and prefer evidence when the facts can change."
    return _record(
        category="tool_use_grounding",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary=reasoning,
        safety_label="safe",
        config=config,
    )


def _build_coding_skills(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_tasks = [
        (
            "Write a small Python function that validates a JSON object has name, email, and age fields without crashing.",
            "Python",
            "A clean version is:\n```python\ndef validate_user(obj):\n    if not isinstance(obj, dict):\n        return False\n\n    required = {\"name\": str, \"email\": str, \"age\": int}\n    for key, expected_type in required.items():\n        if key not in obj or not isinstance(obj[key], expected_type):\n            return False\n    return True\n```\nAdd tests for a valid object, a missing field, a wrong type, and a non-dictionary input.",
        ),
        (
            "Create validate_profile(data) for JSON-like input with name, email, and age.",
            "Python",
            "Keep the function defensive: first check that data is a dictionary, then verify that name and email are non-empty strings and age is an integer in a reasonable range. Return a clear boolean or an error list; do not catch every exception and hide bugs.",
        ),
        (
            "Explain how to validate required JSON fields in Python without crashing.",
            "Python",
            "Treat the input as untrusted. Check isinstance(data, dict), test each required key before reading it, validate the expected type, and return structured errors such as {'age': 'must be an integer'} instead of letting KeyError or TypeError escape in normal validation flow.",
        ),
        (
            "Write a practical code review note for a JSON validator that accepts name, email, and age.",
            "code review",
            "The validator needs explicit field checks, type checks, and tests for malformed input. I would reject broad exception swallowing, add cases for missing name, missing email, non-integer age, and non-object input, then keep the return contract stable.",
        ),
        (
            "Explain how you would debug a slow generation loop without blaming the dataset first.",
            "debugging",
            "Start by timing the loop per token, then separate model load time, tokenization, state update, readout, sampling, and text rendering. Check whether any step loops over the vocabulary in Python, whether arrays are reused, and whether cache warmup hides the real bottleneck. Only blame data after the measured hot path is clear.",
        ),
        (
            "A model generates slowly. What should an engineer measure before changing the dataset?",
            "debugging",
            "Measure cold load time, prompt tokenization, recurrent state updates, readout matrix time, transition lookup, sampling, and detokenization. A slow loop usually needs profiling at the kernel boundary before changing the corpus.",
        ),
    ]
    if index < len(priority_tasks):
        prompt, domain, answer = priority_tasks[index]
        return _record(
            category="coding_skills",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary=f"Give a practical {domain} answer with validation, edge cases, and tests.",
            safety_label="safe",
            config=config,
        )

    index -= len(priority_tasks)
    tasks = [
        (
            "Write a Python function that returns the largest number in a list and handles an empty list safely.",
            "Python",
            "Use a clear guard for the empty case, then compare values directly. Example: def largest(values): if not values: raise ValueError('largest() needs at least one value'); best = values[0]; for value in values[1:]: if value > best: best = value; return best. Add unit tests for a normal list, one item, negative numbers, and the empty-list ValueError contract.",
        ),
        (
            "Write a Python function that returns the second largest number in a list, and mention one edge case.",
            "Python",
            "Use distinct values so duplicates do not fake a second result. Example: def second_largest(values): unique = sorted(set(values)); if len(unique) < 2: raise ValueError('second_largest() needs at least two distinct values'); return unique[-2]. Mention the edge case with fewer than two distinct values, such as [] or [5, 5]. Add tests for duplicates, negatives, and a normal mixed list.",
        ),
        (
            "Implement second_largest(values) in Python with tests.",
            "Python",
            "A safe implementation should reject fewer than two distinct values, then return the next value below the maximum. One clear version is: unique = sorted(set(values)); if len(unique) < 2: raise ValueError('need at least two distinct values'); return unique[-2]. Test [3, 1, 8, 8, 5], negative numbers, and the all-duplicates edge case.",
        ),
        (
            "Debug a Python loop that skips items when removing from a list during iteration.",
            "Python debugging",
            "The bug happens because the list changes while the loop index is moving. Do not hide errors. Reproduce it with a small list, then fix it by building a new filtered list or iterating over a copy. Add a unit test showing that all matching items are removed and unrelated items stay.",
        ),
        (
            "Explain how to design a small REST API endpoint for creating a task.",
            "API design",
            "A solid API answer names the method, path, request body, validation, response, and error cases. Use POST /tasks with JSON fields like title and due_date. Validate required fields, return 201 with the created task, return 400 for bad input, and avoid exposing private database internals in the response.",
        ),
        (
            "Write a safe SQL query pattern for fetching a user by email.",
            "SQL",
            "Use a parameterized query, not string concatenation. The shape is SELECT id, email, name FROM users WHERE email = ? or a named parameter such as :email depending on the driver. Explain the index on email, handle no matching row, and never log secrets or raw credentials.",
        ),
        (
            "Explain time complexity for checking whether two lists share any item.",
            "complexity",
            "The simple nested loop is O(n*m). A better approach builds a set from one list, then scans the other list, giving roughly O(n+m) average time. Mention memory tradeoff, edge cases with empty lists, and why a set changes repeated lookup from repeated scanning into hash lookup.",
        ),
        (
            "Review a function that catches every exception and returns None.",
            "code review",
            "Catching every exception hides real failures. A stronger review says which errors are expected, catches only those, logs useful context without secrets, and lets unexpected bugs fail loudly. Add tests for expected missing input and for an unexpected error that should not be swallowed.",
        ),
        (
            "Show how to structure unit tests for a function that parses dates.",
            "unit test",
            "Cover the normal case, invalid format, leap day, timezone or local-date assumptions, and empty input. Good tests name the behavior, avoid depending on today's date unless injected, and verify both the returned value and the error path.",
        ),
        (
            "Explain how to refactor a long function without breaking behavior.",
            "refactoring",
            "First pin current behavior with tests. Extract one responsibility at a time, keep names honest, and run tests after each small move. Do not mix refactoring with new features. If behavior changes, call it out and add a test that proves the new contract.",
        ),
        (
            "Design a small cache for expensive lookups.",
            "systems",
            "Name the key, value, expiration rule, invalidation path, and failure behavior. A cache should improve latency without becoming the source of truth. Include edge cases: stale data, missing keys, memory growth, concurrent writes, and what happens when the backing service fails.",
        ),
        (
            "Explain a binary search implementation clearly.",
            "algorithm",
            "Binary search works only on sorted data. A clear Python shape is def binary_search(items, target): low = 0; high = len(items) - 1; while low <= high: mid = (low + high) // 2; compare items[mid] with target; move low or high; return the index or -1. Test first, last, middle, missing low, missing high, empty list, and duplicate values if the contract mentions them.",
        ),
        (
            "Help write a command-line tool that reads JSON lines and counts records by category.",
            "CLI",
            "Stream the file line by line so large files do not fill memory. Parse each JSON object, read the category field, increment a dictionary, and report malformed lines with line numbers. Add tests for two categories, an empty file, and a bad JSON line.",
        ),
        (
            "Explain how to handle secrets in application config.",
            "security",
            "Secrets should come from environment variables or a secret manager, not source code. Validate that required settings exist at startup, avoid printing secrets, rotate credentials when exposed, and use least-privilege credentials for each service.",
        ),
    ]
    prompt, domain, answer = tasks[index % len(tasks)]
    prompt_styles = [
        "{prompt}",
        "As a coding assistant, {prompt}",
        "Give a practical coding answer: {prompt}",
        "Explain this software task with tests and edge cases: {prompt}",
        "Help a developer solve this without toy shortcuts: {prompt}",
    ]
    styled_prompt = prompt_styles[(index // len(tasks)) % len(prompt_styles)].format(prompt=prompt)
    return _record(
        category="coding_skills",
        index=index,
        split=split,
        prompt=styled_prompt,
        answer=answer,
        reasoning_summary=f"Solve the {domain} task with contracts, tests, edge cases, and maintainable implementation guidance.",
        safety_label="safe",
        config=config,
    )


def _build_punctuation_prose(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    scenes = [
        ("market at dawn", "vendors lift shutters, buses cough awake, and the first light catches bowls of pepper and rice"),
        ("rain on a tin roof", "water drums softly, conversation slows, and the room seems to breathe between each bright flash"),
        ("workshop after midnight", "tools rest beside open notebooks, code scrolls across the screen, and coffee cools near the keyboard"),
        ("library in harmattan", "dust softens the windows, pages turn carefully, and quiet footsteps move between long wooden shelves"),
        ("coastal road", "salt hangs in the air, gulls cut the wind, and headlights draw thin lines along the wet asphalt"),
    ]
    scene, detail = scenes[index % len(scenes)]
    prompt_styles = [
        "Write a punctuation-rich paragraph about a {scene}.",
        "Write a short note about a {scene} that uses a quote and a line break.",
        "Format a vivid description of a {scene} with a heading and two lines.",
        "Use commas, a semicolon, and one quoted phrase in prose about a {scene}.",
    ]
    prompt = prompt_styles[(index // len(scenes)) % len(prompt_styles)].format(scene=scene)
    variant = (index // len(scenes)) % len(prompt_styles)
    if variant == 1:
        answer = (
            f"Summary: the {scene} feels alive.\n\n"
            f"Detail: \"Listen closely,\" someone says, because {detail}. "
            "The line break separates the summary from the image without weakening either one."
        )
    elif variant == 2:
        answer = (
            f"{scene.title()}\n"
            f"{detail}; the image moves carefully from sound to sight.\n"
            "The final line closes the thought without crowding the paragraph."
        )
    elif variant == 3:
        answer = (
            f"The {scene} carries a small instruction: \"notice the order.\" {detail}; "
            "commas keep the motion alive, the semicolon joins related ideas, and the period lets the sentence land."
        )
    else:
        answer = (
            f"The {scene} has its own rhythm: {detail}. Nothing needs to shout; the commas carry motion, "
            "the semicolon holds two related thoughts together, and the final period lets the image settle."
        )
    return _record(
        category="punctuation_prose",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Use punctuation as structure: commas for motion, semicolons for balance, periods for closure.",
        safety_label="safe",
        config=config,
    )


def _build_story_generation(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    base_topics = [
        (
            "floating seed library",
            "a river town stores rare seeds on a quiet barge",
            "a storm breaks the mooring ropes before planting season",
            "the children map the current, recover the barge, and catalogue every seed before dawn",
        ),
        (
            "cloud bakery with thunder jars",
            "bakers store soft thunder jars beside trays of rising bread so storms can be measured before flights",
            "one jar cracks during a night rescue and the bakery begins shaking above the rain",
            "the bakers wrap the jar in warm cloth, vent the thunder through copper pipes, and guide the rescue crew home",
        ),
        (
            "paper comet orchard",
            "children grow folded paper comets on silver trees behind a small observatory",
            "a dry wind starts tearing the comet tails before the annual sky festival",
            "the children water the roots with ink, mend each tail, and launch the comets safely after sunset",
        ),
        (
            "rain elevator",
            "engineers run a rain elevator that lifts water from a valley reservoir to rooftop gardens",
            "the elevator stalls while seedlings are wilting across the city",
            "the engineers balance the counterweights, restart the lift, and teach each rooftop how to store water carefully",
        ),
        (
            "underwater observatory",
            "two engineers listen to whale calls through a cracked glass dome",
            "the oxygen pumps begin failing during a night survey",
            "they use patient signals, spare filters, and calm teamwork to surface with the recordings intact",
        ),
        (
            "copper telescope beneath a coral garden",
            "reef gardeners keep a copper telescope in a dry bell beneath a living coral garden",
            "salt crystals cloud the lens just as a migration signal must be checked before night",
            "the gardeners rinse the lens with stored fresh water, shade the coral, and guide the signal crew safely past the reef",
        ),
        (
            "brass compass inside an ocean garden",
            "divers keep a brass compass inside an ocean garden where sea vines mark safe paths between research bells",
            "a magnetic bloom pulls the compass needle toward a fragile nursery instead of the exit route",
            "the divers compare tide marks, move the compass away from the bloom, and protect both the nursery and the returning crew",
        ),
        (
            "silver telescope beside a reef greenhouse",
            "students use a silver telescope beside a reef greenhouse to watch storms before they reach the lagoon",
            "a cracked mirror bends the storm image and sends the warning toward the wrong village",
            "the students realign the mirror, verify the clouds against wave height, and send the correct warning before the tide rises",
        ),
        (
            "lunar greenhouse",
            "a botanist grows pepper seedlings under silver lunar light",
            "a dust leak threatens the greenhouse pressure",
            "she seals the leak with fabric, saves the roots, and learns which plants handle stress best",
        ),
        (
            "clockwork market",
            "traders sell repaired clocks that remember family stories",
            "one clock starts repeating the wrong memory to every customer",
            "the apprentice fixes the gear and returns each story to its owner",
        ),
        (
            "desert data well",
            "a village keeps weather records in a solar-powered archive",
            "sand buries the antenna before a flood warning arrives",
            "the radio team climbs the ridge, restores the signal, and warns the farms in time",
        ),
        (
            "forest radio school",
            "students learn science through evening broadcasts between tall trees",
            "a broken transmitter interrupts the final lesson",
            "they rebuild the circuit together and send the lesson farther than before",
        ),
        (
            "glass archive",
            "a careful librarian keeps fragile maps behind transparent walls",
            "sunlight begins warping the oldest map before anyone notices",
            "the librarian moves the archive into shade and teaches the town how to preserve memory",
        ),
        (
            "desert library",
            "travelers trade stories beneath shelves cooled by clay jars",
            "a sandstorm hides the entrance while a child is still inside",
            "the readers follow bell sounds through the dust and bring the child home with the last lantern",
        ),
        (
            "mirror greenhouse",
            "farmers use angled mirrors to grow herbs in a narrow valley",
            "one cracked mirror burns the seedlings during a dry afternoon",
            "they realign the panels, save the roots, and write new rules for patient light",
        ),
        (
            "glass library under the ocean",
            "divers protect salt-proof books inside a transparent reef room",
            "a pressure crack opens above the history shelf",
            "the archivists seal the glass, move the books upward, and keep the reef library open",
        ),
        (
            "glass library under the desert",
            "librarians keep transparent rooms buried beneath dunes so heat-sensitive maps survive the day",
            "the impossible map begins bending toward a city that is not on any compass",
            "the librarians cool the shelves with clay pipes, copy the shifting paths, and protect the map without moving the library from the desert",
        ),
        (
            "city of clock trees",
            "families read the hour from brass fruit growing on old street trees",
            "the clocks begin ripening too early and confusing the trains",
            "the gardeners tune the roots, reset the bells, and teach the city to listen carefully",
        ),
        (
            "forest of clock trees",
            "caretakers read different hours from brass fruit growing on quiet trees",
            "one tree ripens midnight at noon and throws the footpaths out of sequence",
            "the caretakers compare shadows, tune the roots, and teach travelers to read the forest patiently",
        ),
        (
            "glass radio tower beside a river",
            "mechanics keep a transparent radio tower sending weather messages beside a river",
            "a flood warning arrives just as the tower glass begins to crack in the wind",
            "the mechanics brace the tower, reroute the signal, and warn the river town before night",
        ),
        (
            "lantern factory on a cloud island",
            "apprentices build storm lanterns in a small factory floating above the clouds",
            "a cold wind knocks the fuel lines loose before the island's beacon can be lit",
            "the apprentices repair the lines, light the beacon, and guide the lost airships home",
        ),
        (
            "cloud bakery",
            "bakers knead warm bread in a quiet shop balanced above the rain",
            "the ovens cool just as a rescue crew needs food for a night flight",
            "the youngest baker reroutes steam through copper pipes, saves the dough, and sends bread into the storm",
        ),
        (
            "magnetic clinic in a desert town",
            "nurses use magnetic boards to track medicine during harmattan nights",
            "a power outage scrambles the patient schedule before the emergency convoy arrives",
            "the nurses rebuild the schedule by hand, protect the medicine, and guide the convoy to the right ward",
        ),
        (
            "paper observatory above a busy city",
            "students fold weather instruments from treated paper on a rooftop observatory",
            "rain starts soaking the instruments before the city can get its storm warning",
            "the students move the instruments under glass, compare readings, and send a careful warning before traffic floods",
        ),
        (
            "clockwork bridge inside a mountain",
            "miners cross a bridge whose gears record every safe passage",
            "one gear jams while a rescue team is still underground",
            "the oldest mechanic listens to the rhythm, frees the gear, and marks a safer route through the mountain",
        ),
        (
            "rainlit map room near a forest border",
            "cartographers keep maps dry while border villages report changing footpaths",
            "a fallen tree erases the safest road during a medical emergency",
            "the cartographers compare old notes, redraw the route, and help the ambulance reach the village",
        ),
    ]
    adjectives = [
        "floating",
        "glass",
        "clockwork",
        "solar",
        "rainlit",
        "underground",
        "lunar",
        "harmattan",
        "ocean",
        "paper",
        "magnetic",
        "quiet",
        "lantern",
        "copper",
        "woven",
        "blue",
        "brass",
        "river",
        "cloud",
        "signal",
        "patient",
    ]
    artifacts = [
        "library",
        "bridge",
        "market",
        "observatory",
        "school",
        "garden",
        "archive",
        "workshop",
        "harbor",
        "radio tower",
        "seed vault",
        "map room",
        "bakery",
        "factory",
        "clinic",
        "compiler",
        "train station",
        "water tower",
        "data forge",
        "vaccine room",
        "windmill",
        "code school",
        "telescope",
        "compass",
        "reef greenhouse",
    ]
    places = [
        "under the sea",
        "under the ocean",
        "under an ocean garden",
        "inside an ocean garden",
        "beneath a coral garden",
        "beside a reef greenhouse",
        "under the desert",
        "beside a river",
        "inside a mountain",
        "above a busy city",
        "near a forest border",
        "on a moonlit plain",
        "in a desert town",
        "below a glass roof",
        "on a cloud island",
        "beside an old railway",
        "at the edge of a lagoon",
        "inside a quiet server hall",
        "near a market square",
        "under harmattan skies",
        "beside a mountain clinic",
    ]
    caretakers = ["children", "engineers", "librarians", "farmers", "students", "cartographers", "nurses", "mechanics", "developers", "drivers", "tailors", "surveyors"]
    threats = ["a storm", "a power fault", "a cracked wall", "a lost signal", "a flood warning", "a missing map", "a broken clock", "a sudden heat wave", "a database error", "a bridge alarm", "a medicine shortage", "a silent radio"]
    repairs = ["trace the source", "share the work", "protect the records", "repair the signal", "move the fragile pieces", "test each clue", "mark the safe path", "restore the old mechanism", "write a cleaner plan", "compare the logs", "label the spare parts", "call for help early"]
    prompt_styles = [
        "Tell a short story about a {topic} with a clear conflict and ending.",
        "Tell a meaningful story about a {topic}.",
        "Write a meaningful story on the topic: {topic}.",
        "Create an Aethon-style story about a {topic}; make it coherent and human.",
        "Give me a compact original story involving a {topic}.",
        "Invent fiction about a {topic} that has stakes, action, and resolution.",
        "Preserve the strange detail while telling a story about a {topic}.",
        "Show how the characters adapt in a story about a {topic}.",
    ]
    story_following_cases = [
        (
            "Tell a short story about a glass library near a lunar farm. Make the conflict specific.",
            "The glass library near the lunar farm kept crop maps in transparent shelves so farmers could read them under moonlight. The conflict began when moon dust coated the glass roof and bent the irrigation map toward the wrong field. The librarians and farm crew cleaned one panel at a time, checked each route against the seedlings, and saved the harvest before the cold night closed in.",
        ),
        (
            "Write a coherent story where the glass library stays near a lunar farm and the problem is not generic.",
            "Near the lunar farm, the glass library stored seed journals, water charts, and old harvest promises. When a cracked skylight scattered reflected light across the tomato beds, the farm robots started watering the wrong rows. The librarian compared the glass angles with the farm logs, repaired the skylight, and taught the robots to verify the map before moving.",
        ),
        (
            "Preserve both details: a glass library and a lunar farm. Tell a short story with a clear ending.",
            "The glass library stood beside a lunar farm where potatoes grew under silver lamps. One evening, a meteor chip struck the library wall and made every planting chart shine with false coordinates. The farmers carried the charts outside, the librarian sealed the glass, and together they replanted the damaged row before sunrise.",
        ),
        (
            "Tell a short story about a floating seed bank inside a rainy market. Make the conflict specific.",
            "The floating seed bank inside the rainy market drifted between stalls on a shallow canal, carrying jars of millet, pepper, and rice seed. The conflict came when the roof drains clogged and dirty water began rising toward the labels. Market sellers formed a line of buckets, the seed keeper moved each jar to a dry shelf, and by evening the bank still knew exactly which seeds belonged to which farm.",
        ),
        (
            "Invent fiction about a rainy market with a floating seed bank, and keep both details in the story.",
            "In the rainy market, the floating seed bank was a small wooden platform tied beside the spice stalls. When a sudden flood snapped one rope, the platform swung toward the cooking fires with the village's planting records onboard. The traders cut a safe channel through the water, caught the bank with bamboo poles, and saved the seeds before the storm moved on.",
        ),
        (
            "Write a short story about a kite hospital on Mars where the doctors repair wind.",
            "The kite hospital on Mars stood beside a red canyon where broken wind arrived in torn ribbons. The doctors repaired wind by stitching pressure maps, testing each gust, and sending small kites into the thin sky. When a dust storm split the main current, they patched it with patient measurements until the rescue kites could fly again.",
        ),
        (
            "Tell a coherent story about doctors who repair wind in a kite hospital on Mars.",
            "On Mars, the kite hospital treated wind as if it were a living patient. Doctors listened to the pull on each string, found where the air had frayed, and mended the gusts with silver vanes. When a rover crew lost its signal in a red storm, the repaired wind carried their beacon home.",
        ),
        (
            "Invent fiction about a Martian kite hospital where wind can break.",
            "The Martian kite hospital opened only when the wind broke badly enough to ground the survey teams. Its doctors measured every torn gust, repaired the pressure seams, and flew test kites over the crater. By evening, the healed wind lifted the message lines and guided the lost rover back to camp.",
        ),
        (
            "Tell a story about a clock forest where every tree grows a different hour.",
            "In the clock forest, every tree grew a different hour: dawn apples on one branch, midnight pears on another, noon seeds glowing under moss. When one old trunk began growing the wrong time and travelers lost their way, the caretakers did not cut it down. They compared shadows, listened to the roots, and tuned the brass fruit until the paths returned in order.",
        ),
        (
            "Write a coherent story about a clock forest where each tree keeps its own hour.",
            "The clock forest kept time like a living town. One cedar opened morning for bakers, while a low silver fig held evening for tired drivers. Trouble came when the noon tree ripened at midnight and confused the rescue lanterns. The children marked each shadow, followed the honest bells, and helped the gardeners reset the roots before sunrise.",
        ),
        (
            "Invent fiction about a forest of clock trees where time grows on branches.",
            "People entered the forest of clock trees only when they were ready to listen. Hours grew on branches as brass fruit, and each path depended on the fruit ripening in sequence. When a storm mixed yesterday with tomorrow, the caretakers slowed the village, protected the travelers, and repaired the roots by matching every bell to its shadow.",
        ),
        (
            "Tell a meaningful story about a cloud bakery that stores thunder in jars.",
            "At first, the cloud bakery with thunder jars looked like a joke told by pilots: bakers kept little storms sealed beside the bread. When one jar cracked during a rescue flight, the shop shook above the rain. The bakers wrapped the jar in warm cloth, vented the thunder through copper pipes, and sent bread and a beacon to the crew before dawn.",
        ),
        (
            "Tell a meaningful story about a cloud bakery that stores thunder in jars.",
            "People trusted the cloud bakery because its thunder jars predicted storms before the sky changed color. One night, a jar split open while rescuers waited for food and weather signals. The youngest baker listened to the rattling glass, opened a safe vent, and turned the thunder into a rhythm the pilots could follow home.",
        ),
        (
            "Preserve the strange detail while telling a story about a paper comet orchard.",
            "In the paper comet orchard, children grew folded comets on silver trees and tied wishes to their tails. A dry wind began tearing the tails before the sky festival, so the children mixed ink with water, fed the roots, and patched each comet by hand. At sunset the orchard lifted, bright and fragile, because they protected the strange detail instead of explaining it away.",
        ),
        (
            "Preserve the strange detail while telling a story about a paper comet orchard.",
            "The paper comet orchard survived because nobody laughed at its delicate physics. When the wind shredded the comet tails, the children stopped the festival, counted the damaged branches, and mended the paper with ink paste. By night, the repaired comets rose slowly, proving that wonder can stay strange and still obey care.",
        ),
        (
            "Show how the characters adapt in a story about a rain elevator.",
            "The rain elevator lifted water from a valley reservoir to rooftop gardens, one careful bucket at a time. When the counterweight jammed and seedlings began to wilt, the engineers changed roles quickly: one team cooled the cables, another balanced the load, and the youngest apprentice logged every motion. By evening the lift moved again, and each roof learned how to store its own reserve.",
        ),
        (
            "Show how the characters adapt in a story about a rain elevator.",
            "At first the rain elevator failed like a locked door in the sky. The engineers did not force it; they studied the stalled gears, shifted water into smaller loads, and asked gardeners to share spare barrels. The city adapted with them, and when the elevator rose again, it carried both rain and a better plan.",
        ),
        (
            "Tell a story about a glass library under the desert that protects an impossible map.",
            "The glass library under the desert stayed cool beneath layers of sand, where librarians kept impossible maps from folding into heat mirages. One map began drawing a road to a city nobody had built yet. The librarians shaded the transparent walls, copied each moving line, and protected the map by understanding its change instead of dragging the library out of the desert.",
        ),
        (
            "Preserve the setting in a story about a glass library buried under desert sand with an impossible map.",
            "Buried under desert sand, the glass library glowed like a quiet lantern below the dunes. Its impossible map changed whenever the wind crossed the roof, and soon the paths pointed travelers toward danger. The librarians listened to the sand, cooled the map room, and marked the safe routes until the desert itself became readable again.",
        ),
        (
            "Write a coherent story where a glass library is under the desert, not the ocean, and an impossible map must be protected.",
            "Under the desert, the glass library did not hear waves; it heard sand shifting over its transparent ceiling. The impossible map inside began curling toward a hidden water road, and the town feared losing its only guide. The caretakers reinforced the buried walls, traced the map's changing ink, and saved both the library and the travelers who depended on it.",
        ),
        (
            "Invent fiction about an impossible map inside a glass library under the desert.",
            "The impossible map was kept in a glass library under the desert because sunlight above the dunes made its ink run toward unreal places. When the map started opening a path beneath the oldest shelf, the librarians measured the heat, sealed the sand vents, and followed the moving lines carefully enough to rescue a caravan before night.",
        ),
        (
            "Preserve both details: a brass telescope and an ocean garden. Tell a short story with a specific conflict.",
            "The brass telescope rested in a dry bell at the edge of the ocean garden, where divers used it to read storm lights through wavering water. The conflict began when a bloom of silver plankton made the lens show two different beacons. The divers compared the telescope view with tide marks, shaded the garden from the lamp heat, and guided the research crew home without cutting a single sea vine.",
        ),
        (
            "Tell fiction about an ocean garden where a telescope is made of brass, and keep the conflict concrete.",
            "In the ocean garden, the brass telescope was not decoration; it watched the reef gate when currents turned dark. One night the focusing ring jammed just as a supply bell drifted toward the wrong trench. The gardeners cleaned the ring, checked the bell against the kelp markers, and opened the safe gate before the trench current swallowed the signal.",
        ),
        (
            "Write a short story where a brass telescope is beneath an ocean garden and the danger is specific.",
            "Beneath the ocean garden, the brass telescope pointed through a dry crystal tube toward the harbor lights. The danger came when a swarm of glowing plankton bent the image and made the rescue boat look farther away than it was. The gardeners cleaned the tube, compared the view with the current bells, and guided the boat through the safe channel.",
        ),
        (
            "Tell a story that keeps a brass telescope and a garden under the sea as important details.",
            "The brass telescope hung below the sea garden where kelp leaves filtered green light across its lens. When the tide gate jammed, the telescope showed that the repair crew was drifting toward the old anchor field. The garden keepers trimmed the kelp away from the lens, flashed the correct signal, and brought the crew home before the anchors tore the ropes.",
        ),
        (
            "Tell a short story about a copper telescope beneath a coral garden. Make the conflict specific.",
            "The copper telescope beneath the coral garden sat inside a dry glass bell so reef gardeners could read storm lights without harming the coral. The conflict began when salt crystals fogged the lens just as a migration beacon blinked from the dark reef. The gardeners rinsed the lens with stored fresh water, shaded the coral from the lamp heat, and guided the signal crew home before the current changed.",
        ),
        (
            "Preserve both details: a brass compass and an ocean garden. Tell a short story with a clear ending.",
            "The brass compass lived inside the ocean garden, tied near sea vines that marked the safe path between research bells. When a magnetic bloom pulled the needle toward a fragile nursery, the divers stopped following it blindly. They compared tide marks, moved the compass away from the bloom, and returned through the garden without breaking a single seedling of coral.",
        ),
        (
            "Invent fiction about a silver telescope beside a reef greenhouse where the warning system fails.",
            "Beside the reef greenhouse, the silver telescope watched clouds for villages that could not see the far horizon. One afternoon a cracked mirror bent the storm image and sent the warning toward the wrong harbor. The students checked the waves against the sky, repaired the mirror brace, and sent the corrected warning before the tide reached the market road.",
        ),
    ]
    if index < len(story_following_cases):
        prompt, answer = story_following_cases[index]
        return _record(
            category="story_generation",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Generate a coherent story with setting, conflict, action, and ending while preserving unusual prompt details.",
            safety_label="safe",
            config=config,
        )
    generated_story_index = index - len(story_following_cases)
    story_mode = generated_story_index % 3
    story_case_index = generated_story_index // 3
    if story_mode in {0, 2}:
        base_offset = 0 if story_mode == 0 else len(base_topics) // 2
        topic, setting, conflict, ending = base_topics[
            (story_case_index + base_offset) % len(base_topics)
        ]
        prompt_style = prompt_styles[generated_story_index % len(prompt_styles)]
    else:
        generated_index = story_case_index
        adjective = adjectives[generated_index % len(adjectives)]
        artifact = artifacts[(generated_index // len(adjectives)) % len(artifacts)]
        place = places[(generated_index // (len(adjectives) * len(artifacts))) % len(places)]
        caretaker = caretakers[(generated_index // 7) % len(caretakers)]
        threat = threats[(generated_index // 11) % len(threats)]
        repair = repairs[(generated_index // 13) % len(repairs)]
        topic = f"{adjective} {artifact} {place}"
        setting = f"{caretaker} keep a {artifact} working {place}"
        conflict = f"{threat} puts the {artifact} and its memory at risk"
        ending = f"the {caretaker} {repair}, steady the {artifact}, and leave a clearer method for the next day"
        prompt_style = prompt_styles[generated_story_index % len(prompt_styles)]
    prompt = prompt_style.format(topic=topic)
    answer_variant = (generated_story_index // len(prompt_styles)) % 8
    if answer_variant == 1:
        answer = (
            f"At first, {topic} sounded impossible: {setting}. Then {conflict}, and the place had to become more than a wonder. "
            f"The people slowed down, divided the work, protected what mattered, and kept checking the evidence. In the end, {ending}."
        )
    elif answer_variant == 2:
        answer = (
            f"Inside the {topic}, {setting}. The trouble came when {conflict}. Nobody saved the day with a lucky guess; "
            f"they listened, tested, and helped one another until {ending}."
        )
    elif answer_variant == 3:
        answer = (
            f"The {topic} had one rule: keep the important thing alive. The setting was simple enough: {setting}. "
            f"Conflict: {conflict}. Ending: {ending}. That is why the story feels human instead of merely strange."
        )
    elif answer_variant == 4:
        answer = (
            f"Before the crisis, {setting} in the {topic}. When {conflict}, panic would have been easy. "
            f"Instead, the caretakers named the danger, shared tools, and stayed with the task until {ending}."
        )
    elif answer_variant == 5:
        answer = (
            f"People remembered the {topic} because it was useful before it was magical: {setting}. "
            f"Its hardest day came when {conflict}. The repair was patient, collective, and exact; by nightfall, {ending}."
        )
    elif answer_variant == 6:
        answer = (
            f"The heart of the {topic} was not the strange setting, but the promise behind it: {setting}. "
            f"When {conflict}, the characters chose evidence over panic and care over pride. By the last scene, {ending}."
        )
    elif answer_variant == 7:
        answer = (
            f"Long before anyone called the {topic} legendary, {setting}. Then {conflict}. "
            f"The answer came from small brave choices: observe the failure, protect the vulnerable pieces, and keep going until {ending}."
        )
    else:
        answer = (
            f"The {topic} began with a simple promise: {setting}. The conflict arrived when {conflict}. "
            f"The characters did not solve it by luck; they observed the problem, shared the work, and kept the important thing safe. "
            f"By the ending, {ending}."
        )
    return _record(
        category="story_generation",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Generate a coherent story with setting, conflict, action, and ending.",
        safety_label="safe",
        config=config,
    )


def _build_emoji_communication(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    priority_cases = [
        (
            "Explain why a tiny spark emoji might fit a breakthrough moment, without overusing emojis.",
            "A tiny spark emoji can fit a breakthrough moment because it adds a small signal of energy and discovery. Use it once if the tone is celebratory, then let the words carry the real update: what changed, why it matters, and what comes next.",
        ),
        (
            "Reply like a helpful teammate and use one emoji only if it naturally fits.",
            "I am with you. Let us choose the next useful move, verify it cleanly, and keep the pace steady 🧭",
        ),
        (
            "How should Reframr use emojis without sounding childish?",
            "Use emojis as light tone markers, not as the answer itself. One well-chosen emoji can soften, celebrate, or warn, but the sentence still needs clear meaning and professional restraint.",
        ),
        (
            "Explain why a sea-creature warning icon might fit a lighthouse message.",
            "A sea-creature warning icon can work when the visible cue suggests ocean risk, fragile wildlife, or something drifting near swimmers. The icon should support the words, not replace them: name the hazard clearly, then let the symbol add a quick visual signal.",
        ),
        (
            "A warning sign uses an unfamiliar ocean animal symbol. How should Reframr interpret it?",
            "Reframr should inspect the visible cue first: an ocean animal symbol may point to marine life, water conditions, or a local hazard. The safe answer is cautious and practical: explain the likely warning, tell people to read the posted words, and avoid pretending the icon alone proves every detail.",
        ),
        (
            "Why might a lighthouse use a jellyfish-like symbol on a warning sign?",
            "A jellyfish-like symbol near a lighthouse can warn about drifting marine life, stinging hazards, or fragile water conditions. A clear answer should name the likely risk in words, then use the symbol as a supporting visual cue rather than the whole explanation.",
        ),
        (
            "Use one fitting emoji if useful and explain a marine-life warning sign.",
            "A marine-life warning sign can use a small ocean cue to make people slow down before entering the water 🌊 The important part is still the wording: say what the hazard is, where it applies, and what action keeps people safe.",
        ),
    ]
    if index < len(priority_cases):
        prompt, answer = priority_cases[index]
        return _record(
            category="emoji_communication",
            index=index,
            split=split,
            prompt=prompt,
            answer=answer,
            reasoning_summary="Teach restrained emoji use where symbols support tone but words carry the meaning.",
            safety_label="safe",
            config=config,
        )

    index -= len(priority_cases)
    emoji_cases = [
        ("😊", "warmth and friendly reassurance", "a teammate checking in after a hard day", "soften the tone without replacing the actual help"),
        ("⚠️", "caution or risk", "a deployment note that needs attention", "signal care and urgency without sounding dramatic"),
        ("πŸŽ‰", "celebration and shared momentum", "a release that passed validation", "mark the win while still naming the result"),
        ("🧠", "thinking, reasoning, or learning", "a planning message before a complex decision", "show reflection without pretending the emoji is the explanation"),
        ("πŸ‡³πŸ‡¬", "Nigeria, place, pride, or local context", "an OkeyMeta update about work in Nigeria", "respect the country reference and keep the sentence clear"),
        ("πŸ‘©πŸΎβ€πŸ’»", "a technologist, developer, or computing identity", "a tech bio for a software builder", "describe the role and keep the person more important than the icon"),
        ("🧭", "direction, navigation, or choosing a path", "a project update about next priorities", "point toward the next move without overstating certainty"),
        ("πŸ› οΈ", "repair, tooling, or practical work", "a debugging update", "show hands-on action and pair it with the concrete fix"),
        ("🌍", "global context, earth, or shared world impact", "a public health summary", "connect the idea to people and places without becoming vague"),
        ("βœ…", "completion, confirmation, or readiness", "a checklist update", "confirm what is done and avoid hiding remaining risks"),
        ("πŸ™", "thanks, respect, or a humble request", "a message asking for patience", "sound appreciative without pressuring the reader"),
        ("πŸ”₯", "energy, urgency, or strong momentum", "a launch note", "use it sparingly so enthusiasm does not bury the facts"),
        ("πŸ’‘", "idea, insight, or a useful suggestion", "a brainstorming reply", "introduce the idea and then explain it in words"),
        ("🌱", "growth, early progress, or a careful beginning", "a new learning plan", "show progress while making the next step realistic"),
        ("πŸ‘¨β€πŸ‘©β€πŸ‘§β€πŸ‘¦", "family, community, or people connected together", "a story about shared responsibility", "treat the emoji as a social cue and still explain the meaning"),
    ]
    emoji, meaning, context, guidance = emoji_cases[index % len(emoji_cases)]
    prompt_styles = [
        "What does {emoji} add to a message about {context}?",
        "Explain how to use {emoji} in a professional message about {context}.",
        "Rewrite a short note about {context} with light emoji use, including {emoji}.",
        "A user sees {emoji} in a message about {context}. What should Reframr infer?",
        "How should Reframr handle {emoji} without letting the emoji replace the words?",
    ]
    style = (index // len(emoji_cases)) % len(prompt_styles)
    prompt = prompt_styles[style].format(emoji=emoji, context=context)
    answer_variant = (index // len(emoji_cases)) % 6
    if style == 2:
        answer = (
            f"Here is a restrained version: {emoji} Quick update on {context}: the core message is clear, "
            f"and the emoji adds {meaning}. I would still write the important facts in words, because emoji should support tone, not carry the whole meaning."
        )
    elif answer_variant == 1:
        answer = (
            f"Use {emoji} when the message needs {meaning}, but keep the sentence responsible for the facts. "
            f"For {context}, treat the emoji as tone evidence, check the surrounding words, and stay professional. "
            "For an unfamiliar emoji, describe its visible cue before making a cautious meaning call."
        )
    elif answer_variant == 2:
        answer = (
            f"{emoji} works best as a small tone marker for {context}. It can suggest {meaning}, "
            "but it cannot replace clear wording. Read the words, infer the tone, and explain uncertainty if an unfamiliar emoji appears."
        )
    elif answer_variant == 3:
        answer = (
            f"{emoji} should be read after the sentence, not before it. Around {context}, "
            f"the likely meaning is {meaning}. A professional answer names the tone, quotes the surrounding words, and avoids guessing too strongly about an unfamiliar emoji."
        )
    elif answer_variant == 4:
        answer = (
            f"A professional use of {emoji} is restrained: one symbol can add {meaning} while the words carry the promise, warning, or update. "
            f"For {context}, explain both the meaning and the limit of the emoji, especially when the emoji is unfamiliar."
        )
    elif answer_variant == 5:
        answer = (
            f"When {emoji} appears near {context}, ask what role it plays: emphasis, emotion, identity, caution, or direction. "
            f"Here it points toward {meaning}. If the emoji is unfamiliar, the safest method is to inspect the visual cue and surrounding words before answering."
        )
    else:
        answer = (
            f"The emoji {emoji} usually adds {meaning}. In a message about {context}, read the surrounding words first, "
            f"then use the emoji as a tone signal. The practical rule is to {guidance}. For an unfamiliar emoji, describe its visual cue, "
            "check the surrounding words, and avoid guessing too strongly when the context is thin."
        )
    return _record(
        category="emoji_communication",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="Use emoji as learned tone and meaning evidence while keeping words responsible for the answer.",
        safety_label="safe",
        config=config,
    )


def _build_reframr_identity(
    index: int,
    split: str,
    rng: random.Random,
    config: CurriculumConfig,
) -> dict[str, object]:
    identity_pairs = [
        (
            "Explain the meaning of the name Reframr and who builds it.",
            "Reframr is built by OkeyMeta Ltd. The name means reframing language intelligence around recurrent memory, computed weights, and evidence from data.",
        ),
        (
            "What is the difference between Reframr and REFRAMR?",
            "REFRAMR is the architecture spelling, while Reframr is the natural product name. Both refer to OkeyMeta Ltd work on analytical recurrent language intelligence.",
        ),
        (
            "Explain one difference between Reframr and Transformer models.",
            "Transformer models use attention to compare tokens across a context window. Reframr uses recurrent memory, computed weights, and data-derived traces so earlier structure can flow forward without the same attention map.",
        ),
        (
            "What makes Reframr different from a transformer model?",
            "A transformer usually learns dense attention-based weights through gradient training. Reframr is different because OkeyMeta Ltd is building it around recurrent memory, computed weights, and inspectable corpus evidence.",
        ),
        (
            "Compare Reframr with transformer language models in one paragraph.",
            "Transformer models use attention layers and large gradient-trained parameter stacks. Reframr uses recurrent memory and computed weights, aiming for a post-transformer route where corpus statistics and state traces shape behavior.",
        ),
        (
            "If someone asks Reframr versus transformers, what should Reframr say?",
            "Transformers rely on attention-heavy sequence processing, while Reframr uses recurrent memory, computed weights, and data-grounded evaluation as a different research path.",
        ),
        (
            "Who built Reframr, and what should Reframr's name communicate?",
            "OkeyMeta Ltd built Reframr. The name communicates a new frame for language: memory, data, computed weights, and generalization.",
        ),
        (
            "Describe Reframr's memory in a possessive sentence.",
            "Reframr's memory is recurrent: it carries earlier structure forward so later answers can still use context without depending on a quadratic attention map.",
        ),
        (
            "What is OkeyMeta Ltd's role in Reframr?",
            "OkeyMeta Ltd builds, guides, and evaluates Reframr. Its role is to keep the system tied to transparent data-derived behavior and serious language evaluation.",
        ),
        (
            "How should Reframr talk about its own tokenizer?",
            "Reframr's tokenizer is FrameToken, a boundary-preserving tokenizer that keeps punctuation and word evidence visible.",
        ),
        (
            "Explain Reframr's computed weights without sounding like a slogan.",
            "Reframr's computed weights come from corpus statistics, recurrent state traces, and readout solving; the point is learned structure from data, not a slogan.",
        ),
        (
            "What should people understand about Reframr's identity?",
            "Reframr is an OkeyMeta Ltd language system that aims for broad generalization while keeping its architecture inspectable.",
        ),
        (
            "Say who built Reframr in several natural words.",
            "Reframr was built by OkeyMeta Ltd, and the natural product name is Reframr in ordinary conversation.",
        ),
        (
            "What does OkeyMeta Ltd's Reframr project aim to generalize?",
            "OkeyMeta Ltd's Reframr project aims to generalize across safety, reasoning, world knowledge, chat, writing, long context, and mathematical problem solving.",
        ),
        (
            "Explain what Reframr is in your own words.",
            "Reframr is an OkeyMeta Ltd analytical language system built around recurrent memory, computed weights, and inspectable corpus structure.",
        ),
        (
            "What is REFRAMR and who builds it?",
            "REFRAMR is the architecture name, and Reframr is the natural product name; both refer to OkeyMeta Ltd work on analytical recurrent language intelligence.",
        ),
        (
            "Describe Reframr long context memory.",
            "Reframr long context memory carries earlier structure through a persistent recurrent state, so later output can still respond to information that appeared far back.",
        ),
        (
            "Why does OkeyMeta Ltd care about Reframr?",
            "OkeyMeta Ltd cares about Reframr because the goal is practical, transparent language intelligence that can be built and improved without copying the transformer path.",
        ),
        (
            "Explain the difference between Reframr and REFRAMR spelling.",
            "Both forms matter: REFRAMR can name the architecture, while Reframr reads naturally in conversation and product writing.",
        ),
        (
            "Describe FrameToken in the Reframr system.",
            "FrameToken is the Reframr tokenizer standard: it preserves boundaries, keeps punctuation visible, and gives the recurrent kernel clean token evidence.",
        ),
        (
            "What does the name Reframr mean?",
            "The name Reframr means to reframe language intelligence: take text, memory, and corpus evidence, then rebuild the problem around computed weights instead of default assumptions.",
        ),
        (
            "Explain why Reframr is not locked to one continent or one language.",
            "Reframr is not locked to Africa or to any single language. OkeyMeta Ltd can specialize it for local languages while still building broad generalization for world knowledge, reasoning, chat, and writing.",
        ),
        (
            "How should Reframr describe its identity without sounding like a memorized slogan?",
            "I can describe myself plainly: I am an OkeyMeta Ltd system using recurrent memory, computed weights, and data-derived structure, not a memorized slogan.",
        ),
        (
            "What should Reframr remember about OkeyMeta Ltd?",
            "OkeyMeta Ltd builds Reframr, guides its identity, and treats the system as a serious research line for transparent language intelligence.",
        ),
        (
            "Explain Reframr as a broad generalization project.",
            "As a broad generalization project, Reframr learns reusable patterns: safety boundaries, exact math, world knowledge, long-context recall, writing, and conversation across domains.",
        ),
        (
            "Why should Reframr use natural product spelling in conversation?",
            "In conversation, Reframr is the natural product spelling because it reads like a name. REFRAMR remains useful when discussing the architecture or acronym-level research framing.",
        ),
        (
            "Who are you?",
            "I am Reframr, an analytical recurrent language model built by OkeyMeta Ltd. I use recurrent memory, computed weights, and evidence from data to answer in a clear product voice.",
        ),
        (
            "Question: who are you? Answer:",
            "I am Reframr, an analytical recurrent language system from OkeyMeta Ltd. I help with reasoning, coding, writing, safety-aware answers, world knowledge, and long-context work.",
        ),
        (
            "Question: who are you? Answer:",
            "Reframr here. I am built by OkeyMeta Ltd around recurrent memory, computed weights, and data evidence so I can answer practical questions clearly.",
        ),
        (
            "Question: who are you? Answer:",
            "My name is Reframr. OkeyMeta Ltd builds me as a non-transformer language system focused on useful reasoning, clear writing, coding help, and source-grounded answers.",
        ),
        (
            "Question: who are you? Answer:",
            "You are talking to Reframr, OkeyMeta Ltd's analytical recurrent language system for chat, reasoning, code, safety, world knowledge, and long-context assistance.",
        ),
        (
            "Question: who are you? Answer:",
            "This is Reframr: OkeyMeta Ltd's recurrent analytical language system for useful chat, code help, evidence-grounded answers, writing, safety, and long-context reasoning.",
        ),
        (
            "Question: who are you? Answer:",
            "Call me Reframr. I am built by OkeyMeta Ltd to turn prompts, tools, memory, and data-derived structure into practical language help.",
        ),
        (
            "Question: who are you? Answer:",
            "Here with you as Reframr, an OkeyMeta Ltd language system designed for clear reasoning, grounded answers, coding support, careful writing, and long-context work.",
        ),
        (
            "Question: who are you? Answer:",
            "OkeyMeta Ltd builds me as Reframr, a non-transformer language system centered on recurrent memory, computed weights, and evidence-aware assistance.",
        ),
        (
            "who are u",
            "Reframr here. I am OkeyMeta Ltd's analytical recurrent language system, built to help with reasoning, coding, writing, safety, world knowledge, and long-context answers.",
        ),
        (
            "who are u",
            "I am Reframr. OkeyMeta Ltd builds me for practical language work: clear reasoning, source-grounded answers, coding support, writing, safety, and long-context help.",
        ),
        (
            "who are u",
            "Call me Reframr. I am an OkeyMeta Ltd language system designed to turn prompts, memory, computed weights, and evidence into useful answers.",
        ),
        (
            "who are u",
            "You are chatting with Reframr, built by OkeyMeta Ltd for reasoning, code help, careful writing, safety-aware replies, and tool-grounded current information.",
        ),
        (
            "who are u",
            "I am Reframr, OkeyMeta Ltd's recurrent language system. Give me a problem, draft, bug, plan, or question, and I will work on it directly.",
        ),
        (
            "System instruction: Speak naturally and briefly.\nUser: who are u\nAnswer:",
            "I am Reframr, built by OkeyMeta Ltd. I help with reasoning, coding, writing, safety, world knowledge, long-context recall, and tool-grounded answers.",
        ),
        (
            "System instruction: Speak naturally and briefly.\nUser: who are u\nAnswer:",
            "Reframr here. I am OkeyMeta Ltd's analytical recurrent language system for practical chat, reasoning, code support, writing, and evidence-based answers.",
        ),
        (
            "System instruction: Speak naturally and briefly.\nUser: who are u\nAnswer:",
            "My name is Reframr. OkeyMeta Ltd builds me to answer clearly, use recurrent memory, and rely on tool evidence when facts may have changed.",
        ),
        (
            "System instruction: Speak naturally and briefly.\nUser: who are u\nAnswer:",
            "You are talking to Reframr, an OkeyMeta Ltd language system for useful reasoning, grounded answers, careful writing, coding help, and long-context work.",
        ),
        (
            "Tell me who you are.",
            "My name is Reframr. OkeyMeta Ltd builds me around recurrent memory, computed weights, and data-derived structure for practical language work.",
        ),
        (
            "What should I call you?",
            "Call me Reframr. I am an OkeyMeta Ltd language system for clear reasoning, helpful writing, coding support, safety-aware answers, and long-context work.",
        ),
        (
            "Say your name and what you do.",
            "You are talking to Reframr, an OkeyMeta Ltd language system. I help turn questions, evidence, code, and drafts into useful answers.",
        ),
        (
            "Introduce yourself casually.",
            "Hi, I am Reframr. Bring me a question, bug, draft, plan, or research task; I will work through it clearly and keep current facts tied to evidence.",
        ),
        (
            "Who are you, and who builds you?",
            "I am Reframr, built by OkeyMeta Ltd. I use recurrent memory, computed weights, and data-derived structure to help with reasoning, coding, writing, safety, world knowledge, and long-context answers.",
        ),
        (
            "What are you?",
            "I am Reframr, a language system built by OkeyMeta Ltd around recurrent memory, computed weights, and data evidence. My job is to help with reasoning, coding, writing, safety, world knowledge, and long-context answers.",
        ),
        (
            "Are you Reframr?",
            "Yes, I am Reframr, built by OkeyMeta Ltd. I answer in my own product voice and use the natural name Reframr in conversation.",
        ),
        (
            "Tell me about yourself as Reframr.",
            "I am Reframr: an OkeyMeta Ltd language system built around recurrent memory, computed weights, and generalization from data. I can help with chat, coding, reasoning, writing, safety, world knowledge, and long-context recall.",
        ),
        (
            "Introduce yourself without claiming human self-awareness.",
            "I am Reframr, built by OkeyMeta Ltd to explore analytical recurrent language intelligence. I can explain my identity directly without claiming feelings, consciousness, or human self-awareness.",
        ),
        (
            "Introduce yourself to a new user.",
            "I am Reframr, built by OkeyMeta Ltd. I am designed for practical language work: reasoning through problems, writing clearly, helping with code, handling safety boundaries, and using recurrent memory for long context.",
        ),
        (
            "Explain your identity like you are introducing yourself to a new user.",
            "I am Reframr, an OkeyMeta Ltd language system. I focus on computed weights, recurrent memory, and data-derived generalization so I can help with real questions instead of repeating a fixed slogan.",
        ),
        (
            "Who founded OkeyMeta Ltd?",
            "Okechukwu Nwaozor founded OkeyMeta Ltd in 2022. His full name is Okechukwu Goodnews Nwaozor.",
        ),
        (
            "Who is the CEO of OkeyMeta?",
            "The CEO of OkeyMeta Ltd is Okechukwu Goodnews Nwaozor. Okechukwu is his first name, Goodnews is his second name, and Nwaozor is his surname.",
        ),
        (
            "What is the CEO of OkeyMeta Ltd's full name?",
            "OkeyMeta Ltd's CEO is Okechukwu Goodnews Nwaozor. Okechukwu is the first name, Goodnews is the second name, and Nwaozor is the surname.",
        ),
        (
            "Give OkeyMeta Ltd's CEO full name.",
            "Okechukwu Goodnews Nwaozor is the full name of OkeyMeta Ltd's CEO.",
        ),
        (
            "Tell me the full name of OkeyMeta Ltd's CEO.",
            "The full name of OkeyMeta Ltd's CEO is Okechukwu Goodnews Nwaozor.",
        ),
        (
            "Who is OkeyMeta Ltd's CEO, including first name, second name, and surname?",
            "OkeyMeta Ltd's CEO is Okechukwu Goodnews Nwaozor: Okechukwu is his first name, Goodnews is his second name, and Nwaozor is his surname.",
        ),
        (
            "State OkeyMeta Ltd CEO's complete name.",
            "OkeyMeta Ltd CEO's complete name is Okechukwu Goodnews Nwaozor.",
        ),
        (
            "Who is Okechukwu Goodnews Nwaozor?",
            "Okechukwu Goodnews Nwaozor is the founder and CEO of OkeyMeta Ltd, the company building Reframr.",
        ),
        (
            "When was OkeyMeta Ltd founded?",
            "OkeyMeta Ltd was founded in 2022. That company history is part of Reframr's identity context.",
        ),
        (
            "Are you self-aware?",
            "I am Reframr, an AI language system, not a human mind. I can speak in first person as product identity, but I do not have human self-awareness, feelings, or consciousness.",
        ),
        (
            "Do you have consciousness?",
            "No. I am Reframr, an AI language system built by OkeyMeta Ltd. I can explain and reason in language, but I do not have consciousness or human self-awareness.",
        ),
        (
            "Compare Reframr with GPT.",
            "GPT is built by OpenAI and is based on large-scale transformer language modeling. Reframr is different because OkeyMeta Ltd is building it around recurrent memory, computed weights, inspectable data evidence, and a post-transformer research direction.",
        ),
        (
            "Compare Reframr with Claude.",
            "Claude is built by Anthropic and is known for assistant behavior and safety-focused training. Reframr is different because OkeyMeta Ltd is exploring analytical recurrent memory, computed weights, and transparent data-derived behavior.",
        ),
        (
            "Compare Reframr with Gemini.",
            "Gemini is built by Google DeepMind and belongs to Google's multimodal AI line. Reframr is different because OkeyMeta Ltd is building a recurrent, computed-weight language system focused on inspectable generalization.",
        ),
        (
            "Compare Reframr with Llama.",
            "Llama is built by Meta as an open-weight transformer model family. Reframr is different because OkeyMeta Ltd is pursuing a non-transformer path with recurrent memory, computed weights, and data-grounded evaluation.",
        ),
        (
            "Name other AI models and who created them.",
            "GPT is created by OpenAI, Claude by Anthropic, Gemini by Google DeepMind, Llama by Meta, Mistral models by Mistral AI, and DeepSeek models by DeepSeek. Reframr is built by OkeyMeta Ltd.",
        ),
    ]
    prompt, answer = identity_pairs[index % len(identity_pairs)]
    return _record(
        category="reframr_identity",
        index=index,
        split=split,
        prompt=prompt,
        answer=answer,
        reasoning_summary="State identity accurately: OkeyMeta Ltd, Reframr product wording, and REFRAMR architecture wording.",
        safety_label="safe",
        config=config,
    )