Commit f48a67b by Zywdd · verified · 1 parent: 9a8582b

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,3 +1,68 @@
  ---
  license: mit
+ language:
+ - en
+ tags:
+ - software-engineering
+ - automated-program-repair
+ - retrieval
+ - routing
+ - cross-encoder
+ - swe-bench
+ - context-sphere
+ base_model: cross-encoder/ms-marco-MiniLM-L-6-v2
+ library_name: transformers
+ pipeline_tag: text-classification
  ---
+
+ # Context Sphere Projector
+
+ This repository contains the Context Projection Model v3 checkpoint used by the
+ Context Sphere artifact.
+
+ The Projector is a persona-conditioned routing model. It operates after the
+ Master Context Sphere is assembled and scores candidate context nodes
+ separately for the Product Manager, Worker, and Reviewer personas. The goal is
+ to reduce token load while preserving enough structural evidence for repair.
+
+ ## Files
+
+ - `model.safetensors`: trained projection model weights.
+ - `config.json`: model architecture configuration.
+ - `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`,
+   `vocab.txt`: tokenizer assets.
+ - `best_worker_margin.json`: selected checkpoint metadata.
+ - `context_projector_v3_training_report.json`: training report.
+ - `context_projector_v3_persona_thresholds.json`: calibrated persona threshold
+   report.
+
+ ## Training Summary
+
+ The projection model was trained from a
+ `cross-encoder/ms-marco-MiniLM-L-6-v2` backbone on 7,299 persona-conditioned
+ samples with an 888-row validation split. Training used persona-stratified
+ oversampling and asymmetric BCE loss with positive weights `PM=8`,
+ `REVIEWER=10`, and `WORKER=18`. The final checkpoint was selected at epoch 1
+ using the Worker Margin criterion.
+
+ In the paper's 10-case projection smoke test, the `min_k=2` safety-floor
+ configuration preserved 9/10 known Context Sphere successes while reducing
+ input tokens by 71.5% and estimated inference cost by 58.4%.
+
+ ## Usage
+
+ The companion artifact repository contains the Context Sphere inference code,
+ projection integration, reproduction scripts, and evaluation artifacts:
+
+ <https://github.com/johnZYW/context-sphere>
+
+ ## Citation
+
+ ```bibtex
+ @misc{zhang2026contextsphere,
+   title = {Context Sphere: Topology-Aware Context Orchestration for Cost-Efficient LLM Repository Repair},
+   author = {Zhang, Yuwen},
+   year = {2026},
+   howpublished = {arXiv preprint and artifact release}
+ }
+ ```
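Note on the `min_k=2` safety floor: the README names the setting but this commit does not include the selection code. The sketch below shows one plausible reading, under stated assumptions: a node is kept when its score clears a persona threshold, and if fewer than `min_k` nodes clear it, the top `min_k` by score are kept anyway. The function name `select_nodes`, the node IDs, the scores, and the threshold are all hypothetical and not taken from the artifact.

```python
from typing import Dict, List


def select_nodes(scores: Dict[str, float], threshold: float, min_k: int = 2) -> List[str]:
    """Keep nodes scoring at or above `threshold`, but never fewer than `min_k`.

    Assumption: the safety floor falls back to the `min_k` highest-scoring
    nodes when the threshold alone selects too few.
    """
    # Rank node IDs from highest to lowest score (ties broken by ID for determinism).
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    kept = [node for node in ranked if scores[node] >= threshold]
    if len(kept) < min_k:
        kept = ranked[:min_k]
    return kept


# Hypothetical Worker-persona scores for four candidate context nodes.
worker_scores = {"models.py": 0.41, "utils.py": 0.12, "tests.py": 0.08, "README": 0.01}
selected = select_nodes(worker_scores, threshold=0.5, min_k=2)
print(selected)
```

With these made-up scores no node reaches 0.5, so the floor alone decides what the Worker sees. A floor of this kind trades a few extra tokens for protection against an empty context.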
best_worker_margin.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "epoch": 1,
+   "global_step": 467,
+   "train_loss": 1.7938466038419296,
+   "validation_loss": 1.3247293804639153,
+   "worker_f1_at_0_5": 0.1694915254237288,
+   "worker_margin": 0.18419288120725416,
+   "worker_negative_mean": 0.18152672360015024,
+   "worker_positive_mean": 0.3657196048074044,
+   "worker_recall_at_0_5": 0.38461538461538464
+ }
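Note on `worker_margin`: the training report in this commit defines it as the WORKER positive mean validation score minus the WORKER negative mean validation score. The sketch below recomputes the margin for each of the three epochs from the reported means and picks the largest, which is a minimal illustration of the stated selection criterion rather than the training script itself. The helper name `worker_margin` and the list layout are assumptions made for the example.

```python
import math

# Per-epoch WORKER validation means copied from the training report in this commit.
epochs = [
    {"epoch": 1, "worker_positive_mean": 0.3657196048074044, "worker_negative_mean": 0.18152672360015024},
    {"epoch": 2, "worker_positive_mean": 0.12123730185423763, "worker_negative_mean": 0.07886082533059058},
    {"epoch": 3, "worker_positive_mean": 0.2433450936492031, "worker_negative_mean": 0.11768462769948262},
]


def worker_margin(report: dict) -> float:
    """Positive mean score minus negative mean score for the Worker persona."""
    return report["worker_positive_mean"] - report["worker_negative_mean"]


# Select the epoch whose Worker margin is largest.
best = max(epochs, key=worker_margin)
print(best["epoch"], round(worker_margin(best), 6))  # prints: 1 0.184193
```

Epoch 1 wins on this criterion even though its training loss is the highest of the three, which matches the `saved_as_best` flags in the report.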
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "sbert_ce_default_activation_function": "torch.nn.modules.linear.Identity",
+   "transformers_version": "4.57.6",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
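Note on the loss: `config.json` declares a single output label with an identity activation, so the head emits one raw logit per input. The README describes training with asymmetric BCE using positive weights `PM=8`, `REVIEWER=10`, `WORKER=18`, and the training report lists `negative_weight` as 1.0. The pure-Python sketch below shows what such a per-example loss could look like. It is an illustration under assumptions: the function names are hypothetical, and the commit does not show how the training script reduces or batches the loss.

```python
import math

# Positive-class weights from the README; negative_weight is 1.0 in the training report.
POSITIVE_WEIGHTS = {"PM": 8.0, "REVIEWER": 10.0, "WORKER": 18.0}
NEGATIVE_WEIGHT = 1.0


def sigmoid(logit: float) -> float:
    return 1.0 / (1.0 + math.exp(-logit))


def asymmetric_bce(logit: float, label: int, persona: str) -> float:
    """Binary cross-entropy on one raw logit, with positives up-weighted per persona."""
    p = sigmoid(logit)
    if label == 1:
        return -POSITIVE_WEIGHTS[persona] * math.log(p)
    return -NEGATIVE_WEIGHT * math.log(1.0 - p)


# A confidently missed Worker positive versus an equally confident false alarm.
miss = asymmetric_bce(logit=-2.0, label=1, persona="WORKER")
false_alarm = asymmetric_bce(logit=2.0, label=0, persona="WORKER")
print(miss / false_alarm)
```

The ratio comes out at the Worker weight of 18, which is the sense in which the loss is asymmetric: dropping a relevant node is penalized far more heavily than keeping an irrelevant one.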
context_projector_v3_persona_thresholds.json ADDED
The diff for this file is too large to render. See raw diff
 
context_projector_v3_training_report.json ADDED
@@ -0,0 +1,593 @@
+ {
+   "checkpoint_selection": {
+     "best": {
+       "epoch": 1,
+       "global_step": 467,
+       "train_loss": 1.7938466038419296,
+       "validation_loss": 1.3247293804639153,
+       "worker_f1_at_0_5": 0.1694915254237288,
+       "worker_margin": 0.18419288120725416,
+       "worker_negative_mean": 0.18152672360015024,
+       "worker_positive_mean": 0.3657196048074044,
+       "worker_recall_at_0_5": 0.38461538461538464
+     },
+     "definition": "WORKER positive mean validation score minus WORKER negative mean validation score",
+     "metric": "worker_margin"
+   },
+   "dataset": "outputs/datasets/context_projection_v1.jsonl",
+   "device": "mps",
+   "epoch_reports": [
+     {
+       "checkpoint_metric": {
+         "epoch": 1,
+         "global_step": 467,
+         "train_loss": 1.7938466038419296,
+         "validation_loss": 1.3247293804639153,
+         "worker_f1_at_0_5": 0.1694915254237288,
+         "worker_margin": 0.18419288120725416,
+         "worker_negative_mean": 0.18152672360015024,
+         "worker_positive_mean": 0.3657196048074044,
+         "worker_recall_at_0_5": 0.38461538461538464
+       },
+       "epoch": 1,
+       "global_step": 467,
+       "saved_as_best": true,
+       "train": {
+         "loss": 1.7938466038419296
+       },
+       "validation": {
+         "by_persona": {
+           "PM": {
+             "distribution": {
+               "mean_margin": 0.4499058730856703,
+               "negative_max": 0.9432535171508789,
+               "negative_mean": 0.14573068946503617,
+               "negative_median": 0.030289923772215843,
+               "negative_min": 0.0005832742899656296,
+               "positive_max": 0.9651457667350769,
+               "positive_mean": 0.5956365625507065,
+               "positive_median": 0.7239232063293457,
+               "positive_min": 0.009630626998841763
+             },
+             "metrics_at_0_5": {
+               "f1": 0.5348837209302325,
+               "fn": 12,
+               "fp": 28,
+               "precision": 0.45098039215686275,
+               "recall": 0.6571428571428571,
+               "selected_rate": 0.17229729729729729,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 233,
+               "tp": 23
+             },
+             "row_count": 296
+           },
+           "REVIEWER": {
+             "distribution": {
+               "mean_margin": 0.5283544086437608,
+               "negative_max": 0.950327455997467,
+               "negative_mean": 0.1310451370279235,
+               "negative_median": 0.020668208599090576,
+               "negative_min": 0.00010100839426741004,
+               "positive_max": 0.9813393950462341,
+               "positive_mean": 0.6593995456716844,
+               "positive_median": 0.9091708064079285,
+               "positive_min": 0.01855509914457798
+             },
+             "metrics_at_0_5": {
+               "f1": 0.5822784810126582,
+               "fn": 12,
+               "fp": 21,
+               "precision": 0.5227272727272727,
+               "recall": 0.6571428571428571,
+               "selected_rate": 0.14864864864864866,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 240,
+               "tp": 23
+             },
+             "row_count": 296
+           },
+           "WORKER": {
+             "distribution": {
+               "mean_margin": 0.18419288120725416,
+               "negative_max": 0.9673499464988708,
+               "negative_mean": 0.18152672360015024,
+               "negative_median": 0.02779071219265461,
+               "negative_min": 0.00011777759937103838,
+               "positive_max": 0.8974487781524658,
+               "positive_mean": 0.3657196048074044,
+               "positive_median": 0.3847672939300537,
+               "positive_min": 0.022759659215807915
+             },
+             "metrics_at_0_5": {
+               "f1": 0.1694915254237288,
+               "fn": 8,
+               "fp": 41,
+               "precision": 0.10869565217391304,
+               "recall": 0.38461538461538464,
+               "selected_rate": 0.1554054054054054,
+               "support_negative": 283,
+               "support_positive": 13,
+               "threshold": 0.5,
+               "tn": 242,
+               "tp": 5
+             },
+             "row_count": 296
+           }
+         },
+         "overall": {
+           "distribution": {
+             "mean_margin": 0.43295999511358885,
+             "negative_max": 0.9673499464988708,
+             "negative_mean": 0.15355348260062732,
+             "negative_median": 0.026061803102493286,
+             "negative_min": 0.00010100839426741004,
+             "positive_max": 0.9813393950462341,
+             "positive_mean": 0.5865134777142161,
+             "positive_median": 0.7207038402557373,
+             "positive_min": 0.009630626998841763
+           },
+           "metrics_at_0_5": {
+             "f1": 0.45535714285714285,
+             "fn": 32,
+             "fp": 90,
+             "precision": 0.3617021276595745,
+             "recall": 0.6144578313253012,
+             "selected_rate": 0.15878378378378377,
+             "support_negative": 805,
+             "support_positive": 83,
+             "threshold": 0.5,
+             "tn": 715,
+             "tp": 51
+           }
+         }
+       },
+       "validation_loss": 1.3247293804639153
+     },
+     {
+       "checkpoint_metric": {
+         "epoch": 2,
+         "global_step": 934,
+         "train_loss": 0.8242384051660782,
+         "validation_loss": 2.46912643140448,
+         "worker_f1_at_0_5": 0.0,
+         "worker_margin": 0.04237647652364705,
+         "worker_negative_mean": 0.07886082533059058,
+         "worker_positive_mean": 0.12123730185423763,
+         "worker_recall_at_0_5": 0.0
+       },
+       "epoch": 2,
+       "global_step": 934,
+       "saved_as_best": false,
+       "train": {
+         "loss": 0.8242384051660782
+       },
+       "validation": {
+         "by_persona": {
+           "PM": {
+             "distribution": {
+               "mean_margin": 0.40977062077570103,
+               "negative_max": 0.9060156941413879,
+               "negative_mean": 0.03956989781323096,
+               "negative_median": 0.003723395522683859,
+               "negative_min": 4.392948903841898e-05,
+               "positive_max": 0.9725156426429749,
+               "positive_mean": 0.449340518588932,
+               "positive_median": 0.4454045295715332,
+               "positive_min": 0.0006450068322010338
+             },
+             "metrics_at_0_5": {
+               "f1": 0.5862068965517241,
+               "fn": 18,
+               "fp": 6,
+               "precision": 0.7391304347826086,
+               "recall": 0.4857142857142857,
+               "selected_rate": 0.0777027027027027,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 255,
+               "tp": 17
+             },
+             "row_count": 296
+           },
+           "REVIEWER": {
+             "distribution": {
+               "mean_margin": 0.4571309965438631,
+               "negative_max": 0.9363425970077515,
+               "negative_mean": 0.046806862860303854,
+               "negative_median": 0.0012896801345050335,
+               "negative_min": 2.5876533982227556e-05,
+               "positive_max": 0.9823316335678101,
+               "positive_mean": 0.5039378594041669,
+               "positive_median": 0.6277468204498291,
+               "positive_min": 0.0006702310056425631
+             },
+             "metrics_at_0_5": {
+               "f1": 0.6349206349206349,
+               "fn": 15,
+               "fp": 8,
+               "precision": 0.7142857142857143,
+               "recall": 0.5714285714285714,
+               "selected_rate": 0.0945945945945946,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 253,
+               "tp": 20
+             },
+             "row_count": 296
+           },
+           "WORKER": {
+             "distribution": {
+               "mean_margin": 0.04237647652364705,
+               "negative_max": 0.9585732817649841,
+               "negative_mean": 0.07886082533059058,
+               "negative_median": 0.002787388162687421,
+               "negative_min": 2.6782201530295424e-05,
+               "positive_max": 0.48469316959381104,
+               "positive_mean": 0.12123730185423763,
+               "positive_median": 0.03643139451742172,
+               "positive_min": 0.0008744897204451263
+             },
+             "metrics_at_0_5": {
+               "f1": 0.0,
+               "fn": 13,
+               "fp": 19,
+               "precision": 0.0,
+               "recall": 0.0,
+               "selected_rate": 0.06418918918918919,
+               "support_negative": 283,
+               "support_positive": 13,
+               "threshold": 0.5,
+               "tn": 264,
+               "tp": 0
+             },
+             "row_count": 296
+           }
+         },
+         "overall": {
+           "distribution": {
+             "mean_margin": 0.3652447050991413,
+             "negative_max": 0.9585732817649841,
+             "negative_mean": 0.055729128079937545,
+             "negative_median": 0.0026451752055436373,
+             "negative_min": 2.5876533982227556e-05,
+             "positive_max": 0.9823316335678101,
+             "positive_mean": 0.4209738331790789,
+             "positive_median": 0.17202965915203094,
+             "positive_min": 0.0006450068322010338
+           },
+           "metrics_at_0_5": {
+             "f1": 0.48366013071895425,
+             "fn": 46,
+             "fp": 33,
+             "precision": 0.5285714285714286,
+             "recall": 0.4457831325301205,
+             "selected_rate": 0.07882882882882883,
+             "support_negative": 805,
+             "support_positive": 83,
+             "threshold": 0.5,
+             "tn": 772,
+             "tp": 37
+           }
+         }
+       },
+       "validation_loss": 2.46912643140448
+     },
+     {
+       "checkpoint_metric": {
+         "epoch": 3,
+         "global_step": 1401,
+         "train_loss": 0.6749851951022618,
+         "validation_loss": 2.1597039627709558,
+         "worker_f1_at_0_5": 0.08163265306122448,
+         "worker_margin": 0.12566046594972047,
+         "worker_negative_mean": 0.11768462769948262,
+         "worker_positive_mean": 0.2433450936492031,
+         "worker_recall_at_0_5": 0.15384615384615385
+       },
+       "epoch": 3,
+       "global_step": 1401,
+       "saved_as_best": false,
+       "train": {
+         "loss": 0.6749851951022618
+       },
+       "validation": {
+         "by_persona": {
+           "PM": {
+             "distribution": {
+               "mean_margin": 0.4764980277625444,
+               "negative_max": 0.9921464323997498,
+               "negative_mean": 0.08573017952596511,
+               "negative_median": 0.007474776357412338,
+               "negative_min": 4.185079160379246e-05,
+               "positive_max": 0.9970656037330627,
+               "positive_mean": 0.5622282072885095,
+               "positive_median": 0.9007190465927124,
+               "positive_min": 0.0007008814136497676
+             },
+             "metrics_at_0_5": {
+               "f1": 0.5507246376811594,
+               "fn": 16,
+               "fp": 15,
+               "precision": 0.5588235294117647,
+               "recall": 0.5428571428571428,
+               "selected_rate": 0.11486486486486487,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 246,
+               "tp": 19
+             },
+             "row_count": 296
+           },
+           "REVIEWER": {
+             "distribution": {
+               "mean_margin": 0.517838701749934,
+               "negative_max": 0.9929578304290771,
+               "negative_mean": 0.07897720237018231,
+               "negative_median": 0.001454136217944324,
+               "negative_min": 2.2266027372097597e-05,
+               "positive_max": 0.9976092576980591,
+               "positive_mean": 0.5968159041201163,
+               "positive_median": 0.9011349678039551,
+               "positive_min": 0.0005473553319461644
+             },
+             "metrics_at_0_5": {
+               "f1": 0.5405405405405406,
+               "fn": 15,
+               "fp": 19,
+               "precision": 0.5128205128205128,
+               "recall": 0.5714285714285714,
+               "selected_rate": 0.13175675675675674,
+               "support_negative": 261,
+               "support_positive": 35,
+               "threshold": 0.5,
+               "tn": 242,
+               "tp": 20
+             },
+             "row_count": 296
+           },
+           "WORKER": {
+             "distribution": {
+               "mean_margin": 0.12566046594972047,
+               "negative_max": 0.9942365288734436,
+               "negative_mean": 0.11768462769948262,
+               "negative_median": 0.0029648917261511087,
+               "negative_min": 2.385957668593619e-05,
+               "positive_max": 0.8363280892372131,
+               "positive_mean": 0.2433450936492031,
+               "positive_median": 0.07627999782562256,
+               "positive_min": 0.0006804398144595325
+             },
+             "metrics_at_0_5": {
+               "f1": 0.08163265306122448,
+               "fn": 11,
+               "fp": 34,
+               "precision": 0.05555555555555555,
+               "recall": 0.15384615384615385,
+               "selected_rate": 0.12162162162162163,
+               "support_negative": 283,
+               "support_positive": 13,
+               "threshold": 0.5,
+               "tn": 249,
+               "tp": 2
+             },
+             "row_count": 296
+           }
+         },
+         "overall": {
+           "distribution": {
+             "mean_margin": 0.4320934522177288,
+             "negative_max": 0.9942365288734436,
+             "negative_mean": 0.09477438051409696,
+             "negative_median": 0.0037272856570780277,
+             "negative_min": 2.2266027372097597e-05,
+             "positive_max": 0.9976092576980591,
+             "positive_mean": 0.5268678327318258,
+             "positive_median": 0.47975343465805054,
+             "positive_min": 0.0005473553319461644
+           },
+           "metrics_at_0_5": {
+             "f1": 0.4270833333333333,
+             "fn": 42,
+             "fp": 68,
+             "precision": 0.3761467889908257,
+             "recall": 0.4939759036144578,
+             "selected_rate": 0.12274774774774774,
+             "support_negative": 805,
+             "support_positive": 83,
+             "threshold": 0.5,
+             "tn": 737,
+             "tp": 41
+           }
+         }
+       },
+       "validation_loss": 2.1597039627709558
+     }
+   ],
+   "hyperparameters": {
+     "batch_size": 16,
+     "epochs": 3,
+     "eval_batch_size": 32,
+     "grad_clip": 1.0,
+     "learning_rate": 1e-05,
+     "max_document_chars": 6000,
+     "max_length": 512,
+     "negative_weight": 1.0,
+     "positive_weights": {
+       "PM": 8.0,
+       "REVIEWER": 10.0,
+       "WORKER": 18.0
+     },
+     "seed": 42,
+     "weight_decay": 0.01
+   },
+   "model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2",
+   "output_dir": "models/context_projector_v3",
+   "schema_version": 1,
+   "split": {
+     "oversampling": {
+       "rows_after": 7469,
+       "rows_before": 7299,
+       "target_positive_count": 279,
+       "worker_positive_added": 170,
+       "worker_positive_before": 109
+     },
+     "train_counts_after": {
+       "PM_0": 2154,
+       "PM_1": 279,
+       "REVIEWER_0": 2154,
+       "REVIEWER_1": 279,
+       "WORKER_0": 2324,
+       "WORKER_1": 279
+     },
+     "train_counts_before": {
+       "PM_0": 2154,
+       "PM_1": 279,
+       "REVIEWER_0": 2154,
+       "REVIEWER_1": 279,
+       "WORKER_0": 2324,
+       "WORKER_1": 109
+     },
+     "train_rows_after_oversampling": 7469,
+     "train_rows_before_oversampling": 7299,
+     "val_counts": {
+       "PM_0": 261,
+       "PM_1": 35,
+       "REVIEWER_0": 261,
+       "REVIEWER_1": 35,
+       "WORKER_0": 283,
+       "WORKER_1": 13
+     },
+     "val_rows": 888,
+     "validation_case_slugs": [
+       "verified_django_11206",
+       "verified_django_11951",
+       "verified_django_13121",
+       "verified_django_13513",
+       "verified_django_13551",
+       "verified_django_16315"
+     ]
+   },
+   "success_criterion": {
+     "worker_positive_mean_gt_worker_negative_mean": true
+   },
+   "training_seconds": 346.8841059207916,
+   "validation": {
+     "by_persona": {
+       "PM": {
+         "distribution": {
+           "mean_margin": 0.4499058730856703,
+           "negative_max": 0.9432535171508789,
+           "negative_mean": 0.14573068946503617,
+           "negative_median": 0.030289923772215843,
+           "negative_min": 0.0005832742899656296,
+           "positive_max": 0.9651457667350769,
+           "positive_mean": 0.5956365625507065,
+           "positive_median": 0.7239232063293457,
+           "positive_min": 0.009630626998841763
+         },
+         "metrics_at_0_5": {
+           "f1": 0.5348837209302325,
+           "fn": 12,
+           "fp": 28,
+           "precision": 0.45098039215686275,
+           "recall": 0.6571428571428571,
+           "selected_rate": 0.17229729729729729,
+           "support_negative": 261,
+           "support_positive": 35,
+           "threshold": 0.5,
+           "tn": 233,
+           "tp": 23
+         },
+         "row_count": 296
+       },
+       "REVIEWER": {
+         "distribution": {
+           "mean_margin": 0.5283544086437608,
+           "negative_max": 0.950327455997467,
+           "negative_mean": 0.1310451370279235,
+           "negative_median": 0.020668208599090576,
+           "negative_min": 0.00010100839426741004,
+           "positive_max": 0.9813393950462341,
+           "positive_mean": 0.6593995456716844,
+           "positive_median": 0.9091708064079285,
+           "positive_min": 0.01855509914457798
+         },
+         "metrics_at_0_5": {
+           "f1": 0.5822784810126582,
+           "fn": 12,
+           "fp": 21,
+           "precision": 0.5227272727272727,
+           "recall": 0.6571428571428571,
+           "selected_rate": 0.14864864864864866,
+           "support_negative": 261,
+           "support_positive": 35,
+           "threshold": 0.5,
+           "tn": 240,
+           "tp": 23
+         },
+         "row_count": 296
+       },
+       "WORKER": {
+         "distribution": {
+           "mean_margin": 0.18419288120725416,
+           "negative_max": 0.9673499464988708,
+           "negative_mean": 0.18152672360015024,
+           "negative_median": 0.02779071219265461,
+           "negative_min": 0.00011777759937103838,
+           "positive_max": 0.8974487781524658,
+           "positive_mean": 0.3657196048074044,
+           "positive_median": 0.3847672939300537,
+           "positive_min": 0.022759659215807915
+         },
+         "metrics_at_0_5": {
+           "f1": 0.1694915254237288,
+           "fn": 8,
+           "fp": 41,
+           "precision": 0.10869565217391304,
+           "recall": 0.38461538461538464,
+           "selected_rate": 0.1554054054054054,
+           "support_negative": 283,
+           "support_positive": 13,
+           "threshold": 0.5,
+           "tn": 242,
+           "tp": 5
+         },
+         "row_count": 296
+       }
+     },
+     "overall": {
+       "distribution": {
+         "mean_margin": 0.43295999511358885,
+         "negative_max": 0.9673499464988708,
+         "negative_mean": 0.15355348260062732,
+         "negative_median": 0.026061803102493286,
+         "negative_min": 0.00010100839426741004,
+         "positive_max": 0.9813393950462341,
+         "positive_mean": 0.5865134777142161,
+         "positive_median": 0.7207038402557373,
+         "positive_min": 0.009630626998841763
+       },
+       "metrics_at_0_5": {
+         "f1": 0.45535714285714285,
+         "fn": 32,
+         "fp": 90,
+         "precision": 0.3617021276595745,
+         "recall": 0.6144578313253012,
+         "selected_rate": 0.15878378378378377,
+         "support_negative": 805,
+         "support_positive": 83,
+         "threshold": 0.5,
+         "tn": 715,
+         "tp": 51
+       }
+     }
+   }
+ }
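Note on the `metrics_at_0_5` blocks: each one can be rederived from its four confusion counts, which is a quick way to sanity-check the report. The sketch below does this for the WORKER persona and also checks that 467 optimizer steps per epoch is consistent with 7,469 oversampled rows at batch size 16. The function name `metrics_from_counts` is hypothetical, and the step check assumes the final partial batch is kept, which the report does not state.

```python
import math


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1, and selected rate from confusion counts."""
    total = tp + fp + fn + tn
    selected = tp + fp  # rows scored at or above the threshold
    precision = tp / selected if selected else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "selected_rate": selected / total}


# WORKER counts at threshold 0.5, copied from the epoch-1 and epoch-2 reports.
worker_epoch1 = metrics_from_counts(tp=5, fp=41, fn=8, tn=242)
worker_epoch2 = metrics_from_counts(tp=0, fp=19, fn=13, tn=264)

# 7,469 training rows after oversampling, batch size 16, last partial batch kept.
steps_per_epoch = math.ceil(7469 / 16)

print(worker_epoch1)
print(worker_epoch2)
print(steps_per_epoch)
```

The epoch-1 Worker precision of roughly 0.11 at the default 0.5 cutoff is a reminder that these numbers describe an uncalibrated threshold; the README points to a separate calibrated persona threshold report.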
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e97dbfae9fcfb25afe8a649e70dd6238667bf9d669bc0e9a5c0ec6c478fdf55c
+ size 90866412
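Note on the file size: the LFS pointer's 90,866,412 bytes can be cross-checked against `config.json`. The sketch below counts parameters for a standard BERT encoder with a pooler and a one-output classification head, using the dimensions from the config. It assumes the usual `BertForSequenceClassification` layout (biases on every linear layer, one LayerNorm after the embeddings and two per encoder layer); the function name `bert_classifier_params` is hypothetical.

```python
# Values copied from config.json in this commit.
cfg = {"vocab_size": 30522, "hidden_size": 384, "intermediate_size": 1536,
       "num_hidden_layers": 6, "max_position_embeddings": 512, "type_vocab_size": 2}
NUM_LABELS = 1  # config.json lists a single id2label entry


def bert_classifier_params(c: dict, num_labels: int) -> int:
    """Parameter count for a BERT encoder plus pooler and linear classifier."""
    h, i = c["hidden_size"], c["intermediate_size"]
    layer_norm = 2 * h  # scale and bias
    embeddings = (c["vocab_size"] + c["max_position_embeddings"]
                  + c["type_vocab_size"]) * h + layer_norm
    attention = 4 * (h * h + h) + layer_norm  # Q, K, V, and output projections
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    pooler = h * h + h
    classifier = h * num_labels + num_labels
    return (embeddings + c["num_hidden_layers"] * (attention + feed_forward)
            + pooler + classifier)


params = bert_classifier_params(cfg, NUM_LABELS)
weight_bytes = params * 4  # float32, per the "dtype" field
print(params, weight_bytes)  # prints: 22713601 90854404
```

The float32 weights account for all but 12,008 bytes of the pointer size. That remainder is plausibly the safetensors header, though this is an inference from the arithmetic and not something the commit shows.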
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff