Commit de92911 · PuxAI · verified · 1 parent: 415dd52

Upload ablation summaries

mbert_paper_metrics/docs/ablation_results.csv ADDED
@@ -0,0 +1,86 @@
+ source,task,model,strategy,seed,metric,score,eval_loss,train_loss,epoch,eval_samples,path
+ result_ablation_mbert_paper,cola,mBERT,hf_sequence_classifier,43,eval_matthews_correlation,0.770612811219967,0.29299455881118774,0.2762503274876009,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,hf_sequence_classifier,44,eval_matthews_correlation,0.7729513152802419,0.2829234004020691,0.2657259335027677,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,hf_sequence_classifier,42,eval_matthews_correlation,0.7493368300485673,0.29159754514694214,0.2649622825075904,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,cls,42,eval_matthews_correlation,0.7376962218509274,0.3118951916694641,0.26396365477659994,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,cls,43,eval_matthews_correlation,0.7682410888691849,0.2848812937736511,0.26999439925790947,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,cls,44,eval_matthews_correlation,0.7682418720619295,0.29706358909606934,0.25450210036518417,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mean,42,eval_matthews_correlation,0.7326843807709864,0.3039434850215912,0.253393465856154,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mean,43,eval_matthews_correlation,0.754077957159406,0.27793699502944946,0.25863299785744737,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mean,44,eval_matthews_correlation,0.780066626120282,0.28495022654533386,0.248950488099428,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,max,42,eval_matthews_correlation,0.7354753979999464,0.3104800581932068,0.26581977386712285,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,max,43,eval_matthews_correlation,0.7638553055158107,0.2829602062702179,0.26830103241394615,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,max,44,eval_matthews_correlation,0.7660475389427814,0.2871670424938202,0.2616100898041532,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,attention,42,eval_matthews_correlation,0.7469782739691797,0.3018324375152588,0.24882495366152946,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,attention,43,eval_matthews_correlation,0.7658812546257014,0.28296852111816406,0.26194217346167636,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,attention,44,eval_matthews_correlation,0.7776576303274508,0.28675714135169983,0.25152042053198886,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mha_attention,42,eval_matthews_correlation,0.7567278866591004,0.29563766717910767,0.2555409435913942,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mha_attention,43,eval_matthews_correlation,0.7658953499343086,0.28152528405189514,0.2595851941272106,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,mha_attention,44,eval_matthews_correlation,0.7825325139082561,0.29194650053977966,0.2563083238690813,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,multi_branch_average,42,eval_matthews_correlation,0.7659095653684762,0.3065093457698822,0.24797400135860265,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,multi_branch_average,43,eval_matthews_correlation,0.7755378989688012,0.29216188192367554,0.2530268902347838,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,multi_branch_average,44,eval_matthews_correlation,0.7637565208987979,0.3023347556591034,0.24633375506534755,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_44/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,gated_multi_branch,42,eval_matthews_correlation,0.7641385489211314,0.30131927132606506,0.2518068837970959,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_42/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,gated_multi_branch,43,eval_matthews_correlation,0.7733151048821603,0.29004186391830444,0.2568637455735251,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_43/all_results.json
+ result_ablation_mbert_paper,cola,mBERT,gated_multi_branch,44,eval_matthews_correlation,0.7612027683763856,0.2971993386745453,0.24624314412149684,3.0,1043,/workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,hf_sequence_classifier,42,eval_combined_score,0.8253864685806063,0.4254560172557831,0.49463975602301996,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,hf_sequence_classifier,43,eval_combined_score,0.8455882352941176,0.3725244998931885,0.46222704044286755,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,hf_sequence_classifier,44,eval_combined_score,0.8302672780138671,0.3804396092891693,0.483004673667576,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,cls,42,eval_combined_score,0.8410480428333695,0.3644275963306427,0.45598942300547723,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,cls,43,eval_combined_score,0.8470165044435041,0.36292579770088196,0.46971310739931854,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,cls,44,eval_combined_score,0.8299842837898519,0.3964694142341614,0.4667554800061212,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mean,42,eval_combined_score,0.8496400405180522,0.3612998425960541,0.45593016389487445,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mean,43,eval_combined_score,0.8611445944498017,0.35825616121292114,0.45004705760789954,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mean,44,eval_combined_score,0.8521589486858574,0.3471723198890686,0.471782013989877,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,max,42,eval_combined_score,0.8575504828797191,0.32597723603248596,0.4347189405690069,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,max,43,eval_combined_score,0.8480907445245008,0.3480868637561798,0.4460654051407524,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,max,44,eval_combined_score,0.8534696406443618,0.3394975960254669,0.450698631397192,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,attention,42,eval_combined_score,0.8390056022408964,0.3504122495651245,0.45668997971907904,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,attention,43,eval_combined_score,0.8505793226381462,0.3571226894855499,0.4533920979154283,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,attention,44,eval_combined_score,0.8534696406443618,0.3526938855648041,0.43968486094820325,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mha_attention,42,eval_combined_score,0.8476126638500704,0.3495355248451233,0.4543619294097458,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mha_attention,43,eval_combined_score,0.8642209572000843,0.35361242294311523,0.45127414620440937,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,mha_attention,44,eval_combined_score,0.8602692001014824,0.34483397006988525,0.4315704953843269,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,multi_branch_average,42,eval_combined_score,0.8541666666666667,0.3541111648082733,0.4266196886698405,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,multi_branch_average,43,eval_combined_score,0.8521384241770102,0.3447897434234619,0.44704633519269416,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,multi_branch_average,44,eval_combined_score,0.8455882352941176,0.36839744448661804,0.45383731178615405,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_44/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,gated_multi_branch,42,eval_combined_score,0.8460712752254187,0.35193678736686707,0.43137245592863666,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_42/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,gated_multi_branch,43,eval_combined_score,0.8387623866751002,0.3761675953865051,0.47398912733879645,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_43/all_results.json
+ result_ablation_mbert_paper,mrpc,mBERT,gated_multi_branch,44,eval_combined_score,0.8575317965023848,0.3612346351146698,0.4625883102416992,3.0,408,/workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,hf_sequence_classifier,42,eval_accuracy,0.8692660550458715,0.41583138704299927,0.25817311102685153,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,hf_sequence_classifier,43,eval_accuracy,0.8784403669724771,0.3952513635158539,0.2572364091684397,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,hf_sequence_classifier,44,eval_accuracy,0.8887614678899083,0.3691374659538269,0.2579068086115217,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,cls,42,eval_accuracy,0.8761467889908257,0.3886052072048187,0.258284489502533,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,cls,43,eval_accuracy,0.8795871559633027,0.4069834053516388,0.2556771510948006,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,cls,44,eval_accuracy,0.8772935779816514,0.3817909359931946,0.258519544737356,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mean,42,eval_accuracy,0.8727064220183486,0.4059946835041046,0.2541694050258054,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mean,43,eval_accuracy,0.8899082568807339,0.36042511463165283,0.25073873817401565,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mean,44,eval_accuracy,0.8795871559633027,0.36055734753608704,0.25276644231776635,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,max,42,eval_accuracy,0.8704128440366973,0.39564406871795654,0.25985605001260814,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,max,43,eval_accuracy,0.8807339449541285,0.3865836262702942,0.2557056057764629,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,max,44,eval_accuracy,0.8727064220183486,0.3852188289165497,0.25833009861804906,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,attention,42,eval_accuracy,0.8646788990825688,0.3976602256298065,0.25434095785906646,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,attention,43,eval_accuracy,0.8876146788990825,0.364219605922699,0.25101400063515467,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,attention,44,eval_accuracy,0.8922018348623854,0.36168256402015686,0.2530433903208632,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mha_attention,42,eval_accuracy,0.8738532110091743,0.3983316421508789,0.255206602108639,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mha_attention,43,eval_accuracy,0.8772935779816514,0.3741566240787506,0.25444065910997793,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,mha_attention,44,eval_accuracy,0.8772935779816514,0.37144818902015686,0.2535441892069668,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,multi_branch_average,42,eval_accuracy,0.8704128440366973,0.3955666124820709,0.2517735588106954,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,multi_branch_average,43,eval_accuracy,0.8956422018348624,0.3477514088153839,0.25490886166467613,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,multi_branch_average,44,eval_accuracy,0.8784403669724771,0.37326204776763916,0.2546568879605472,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_44/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,gated_multi_branch,42,eval_accuracy,0.8727064220183486,0.3894171714782715,0.25268540642890114,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_42/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,gated_multi_branch,43,eval_accuracy,0.8795871559633027,0.3644150495529175,0.25354172120656837,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_43/all_results.json
+ result_ablation_mbert_paper,sst2,mBERT,gated_multi_branch,44,eval_accuracy,0.875,0.3715299665927887,0.2534543433457448,3.0,872,/workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_44/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,hf_sequence_classifier,42,eval_accuracy,0.932406822488945,0.22124937176704407,0.2343874844637784,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/hf_sequence_classifier/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,hf_sequence_classifier,43,eval_accuracy,0.9317751105495894,0.2234826385974884,0.2347770626450474,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/hf_sequence_classifier/seed_43/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,cls,42,eval_accuracy,0.932406822488945,0.21786975860595703,0.22804758987782442,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/cls/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,cls,44,eval_accuracy,0.934301958307012,0.21943630278110504,0.232700464608786,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/cls/seed_44/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,mean,42,eval_accuracy,0.9330385344283006,0.2111007124185562,0.2191808607194807,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/mean/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,max,42,eval_accuracy,0.9336702463676564,0.2267763763666153,0.23415064422678558,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/max/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,max,43,eval_accuracy,0.9349336702463676,0.22572965919971466,0.23497871141055804,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/max/seed_43/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,attention,44,eval_accuracy,0.9368288060644346,0.20899568498134613,0.22297021336766668,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/attention/seed_44/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,mha_attention,42,eval_accuracy,0.9330385344283006,0.21529895067214966,0.22224652350365698,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,mha_attention,43,eval_accuracy,0.9330385344283006,0.21815043687820435,0.22466682554124953,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_43/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,mha_attention,44,eval_accuracy,0.934301958307012,0.21530689299106598,0.22445858664168067,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_44/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,multi_branch_average,42,eval_accuracy,0.9355653821857233,0.2141973078250885,0.21831525011218234,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/multi_branch_average/seed_42/all_results.json
+ result_ablation_mbert_paper,vsfc,mBERT,gated_multi_branch,44,eval_accuracy,0.9279848389134555,0.21946457028388977,0.2245002004094335,3.0,1583,/workspace/result_ablation_mbert_paper/vsfc/mBERT/gated_multi_branch/seed_44/all_results.json
mbert_paper_metrics/docs/ablation_results_aggregate.csv ADDED
@@ -0,0 +1,33 @@
+ source,task,model,strategy,metric,n,mean,std,min,max
+ result_ablation_mbert_paper,cola,mBERT,attention,eval_matthews_correlation,3,0.7635057196407773,0.01547701849478434,0.7469782739691797,0.7776576303274508
+ result_ablation_mbert_paper,cola,mBERT,cls,eval_matthews_correlation,3,0.758059727594014,0.01763531328797097,0.7376962218509274,0.7682418720619295
+ result_ablation_mbert_paper,cola,mBERT,gated_multi_branch,eval_matthews_correlation,3,0.7662188073932258,0.00631844762503582,0.7612027683763856,0.7733151048821603
+ result_ablation_mbert_paper,cola,mBERT,hf_sequence_classifier,eval_matthews_correlation,3,0.764300318849592,0.013011404541162163,0.7493368300485673,0.7729513152802419
+ result_ablation_mbert_paper,cola,mBERT,max,eval_matthews_correlation,3,0.7551260808195128,0.017053254038628063,0.7354753979999464,0.7660475389427814
+ result_ablation_mbert_paper,cola,mBERT,mean,eval_matthews_correlation,3,0.7556096546835581,0.023728229317931226,0.7326843807709864,0.780066626120282
+ result_ablation_mbert_paper,cola,mBERT,mha_attention,eval_matthews_correlation,3,0.7683852501672217,0.013081261378183762,0.7567278866591004,0.7825325139082561
+ result_ablation_mbert_paper,cola,mBERT,multi_branch_average,eval_matthews_correlation,3,0.7684013284120251,0.006273506165294328,0.7637565208987979,0.7755378989688012
+ result_ablation_mbert_paper,mrpc,mBERT,attention,eval_combined_score,3,0.8476848551744681,0.0076541203386116755,0.8390056022408964,0.8534696406443618
+ result_ablation_mbert_paper,mrpc,mBERT,cls,eval_combined_score,3,0.8393496103555752,0.008642201094622479,0.8299842837898519,0.8470165044435041
+ result_ablation_mbert_paper,mrpc,mBERT,gated_multi_branch,eval_combined_score,3,0.8474551528009678,0.009460920894618174,0.8387623866751002,0.8575317965023848
+ result_ablation_mbert_paper,mrpc,mBERT,hf_sequence_classifier,eval_combined_score,3,0.833747327296197,0.010540915607401833,0.8253864685806063,0.8455882352941176
+ result_ablation_mbert_paper,mrpc,mBERT,max,eval_combined_score,3,0.8530369560161939,0.0047446890759971425,0.8480907445245008,0.8575504828797191
+ result_ablation_mbert_paper,mrpc,mBERT,mean,eval_combined_score,3,0.8543145278845704,0.006047609573507266,0.8496400405180522,0.8611445944498017
+ result_ablation_mbert_paper,mrpc,mBERT,mha_attention,eval_combined_score,3,0.8573676070505457,0.008676017731365136,0.8476126638500704,0.8642209572000843
+ result_ablation_mbert_paper,mrpc,mBERT,multi_branch_average,eval_combined_score,3,0.8506311087125982,0.004483455267461182,0.8455882352941176,0.8541666666666667
+ result_ablation_mbert_paper,sst2,mBERT,attention,eval_accuracy,3,0.8814984709480123,0.014745643365432668,0.8646788990825688,0.8922018348623854
+ result_ablation_mbert_paper,sst2,mBERT,cls,eval_accuracy,3,0.8776758409785933,0.001751749118866907,0.8761467889908257,0.8795871559633027
+ result_ablation_mbert_paper,sst2,mBERT,gated_multi_branch,eval_accuracy,3,0.8757645259938838,0.003503498237733826,0.8727064220183486,0.8795871559633027
+ result_ablation_mbert_paper,sst2,mBERT,hf_sequence_classifier,eval_accuracy,3,0.878822629969419,0.009753326316646119,0.8692660550458715,0.8887614678899083
+ result_ablation_mbert_paper,sst2,mBERT,max,eval_accuracy,3,0.8746177370030581,0.00541951333285852,0.8704128440366973,0.8807339449541285
+ result_ablation_mbert_paper,sst2,mBERT,mean,eval_accuracy,3,0.8807339449541284,0.00865806701292519,0.8727064220183486,0.8899082568807339
+ result_ablation_mbert_paper,sst2,mBERT,mha_attention,eval_accuracy,3,0.8761467889908258,0.001986296797670735,0.8738532110091743,0.8772935779816514
+ result_ablation_mbert_paper,sst2,mBERT,multi_branch_average,eval_accuracy,3,0.8814984709480123,0.01288969059639708,0.8704128440366973,0.8956422018348624
+ result_ablation_mbert_paper,vsfc,mBERT,attention,eval_accuracy,1,0.9368288060644346,0.0,0.9368288060644346,0.9368288060644346
+ result_ablation_mbert_paper,vsfc,mBERT,cls,eval_accuracy,2,0.9333543903979785,0.0013400633882246875,0.932406822488945,0.934301958307012
+ result_ablation_mbert_paper,vsfc,mBERT,gated_multi_branch,eval_accuracy,1,0.9279848389134555,0.0,0.9279848389134555,0.9279848389134555
+ result_ablation_mbert_paper,vsfc,mBERT,hf_sequence_classifier,eval_accuracy,2,0.9320909665192672,0.0004466877960748697,0.9317751105495894,0.932406822488945
+ result_ablation_mbert_paper,vsfc,mBERT,max,eval_accuracy,2,0.934301958307012,0.0008933755921497394,0.9336702463676564,0.9349336702463676
+ result_ablation_mbert_paper,vsfc,mBERT,mean,eval_accuracy,1,0.9330385344283006,0.0,0.9330385344283006,0.9330385344283006
+ result_ablation_mbert_paper,vsfc,mBERT,mha_attention,eval_accuracy,3,0.9334596757212045,0.0007294381164746089,0.9330385344283006,0.934301958307012
+ result_ablation_mbert_paper,vsfc,mBERT,multi_branch_average,eval_accuracy,1,0.9355653821857233,0.0,0.9355653821857233,0.9355653821857233
mbert_paper_metrics/docs/ablation_summary.md ADDED
@@ -0,0 +1,148 @@
+ # Ablation Result Summary
+
+ The main metric is selected per task: CoLA uses Matthews correlation; MRPC, QQP, and STS-B use the combined GLUE score when available; all other classification tasks use accuracy.
+
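The per-task metric rule above can be sketched as a small lookup. This is a minimal illustration, not the script that produced these files: the function name `select_main_metric` is hypothetical, and the fallback keys tried when `eval_combined_score` is missing are an assumption, since the summary only says "when available".

```python
from typing import Dict, Tuple


def select_main_metric(task: str, results: Dict[str, float]) -> Tuple[str, float]:
    """Pick the headline metric for one run (hypothetical helper).

    `results` is assumed to look like the contents of an all_results.json file,
    i.e. a flat mapping from metric name to value.
    """
    task = task.lower()
    if task == "cola":
        candidates = ["eval_matthews_correlation"]
    elif task in {"mrpc", "qqp", "stsb"}:
        # Combined score first; the fallbacks are an assumption, not from the source.
        candidates = ["eval_combined_score", "eval_accuracy", "eval_pearson"]
    else:
        candidates = ["eval_accuracy"]

    for name in candidates:
        if name in results:
            return name, float(results[name])
    raise KeyError(f"none of {candidates} found for task {task!r}")


# Example with values taken from the raw runs in this commit.
print(select_main_metric("cola", {"eval_matthews_correlation": 0.770612811219967}))
print(select_main_metric("sst2", {"eval_accuracy": 0.8692660550458715}))
```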
+ ## Aggregated Results
+ | source | task | model | strategy | metric | n | mean | std | min | max |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+ | result_ablation_mbert_paper | cola | mBERT | attention | eval_matthews_correlation | 3 | 0.7635 | 0.0155 | 0.7470 | 0.7777 |
+ | result_ablation_mbert_paper | cola | mBERT | cls | eval_matthews_correlation | 3 | 0.7581 | 0.0176 | 0.7377 | 0.7682 |
+ | result_ablation_mbert_paper | cola | mBERT | gated_multi_branch | eval_matthews_correlation | 3 | 0.7662 | 0.0063 | 0.7612 | 0.7733 |
+ | result_ablation_mbert_paper | cola | mBERT | hf_sequence_classifier | eval_matthews_correlation | 3 | 0.7643 | 0.0130 | 0.7493 | 0.7730 |
+ | result_ablation_mbert_paper | cola | mBERT | max | eval_matthews_correlation | 3 | 0.7551 | 0.0171 | 0.7355 | 0.7660 |
+ | result_ablation_mbert_paper | cola | mBERT | mean | eval_matthews_correlation | 3 | 0.7556 | 0.0237 | 0.7327 | 0.7801 |
+ | result_ablation_mbert_paper | cola | mBERT | mha_attention | eval_matthews_correlation | 3 | 0.7684 | 0.0131 | 0.7567 | 0.7825 |
+ | result_ablation_mbert_paper | cola | mBERT | multi_branch_average | eval_matthews_correlation | 3 | 0.7684 | 0.0063 | 0.7638 | 0.7755 |
+ | result_ablation_mbert_paper | mrpc | mBERT | attention | eval_combined_score | 3 | 0.8477 | 0.0077 | 0.8390 | 0.8535 |
+ | result_ablation_mbert_paper | mrpc | mBERT | cls | eval_combined_score | 3 | 0.8393 | 0.0086 | 0.8300 | 0.8470 |
+ | result_ablation_mbert_paper | mrpc | mBERT | gated_multi_branch | eval_combined_score | 3 | 0.8475 | 0.0095 | 0.8388 | 0.8575 |
+ | result_ablation_mbert_paper | mrpc | mBERT | hf_sequence_classifier | eval_combined_score | 3 | 0.8337 | 0.0105 | 0.8254 | 0.8456 |
+ | result_ablation_mbert_paper | mrpc | mBERT | max | eval_combined_score | 3 | 0.8530 | 0.0047 | 0.8481 | 0.8576 |
+ | result_ablation_mbert_paper | mrpc | mBERT | mean | eval_combined_score | 3 | 0.8543 | 0.0060 | 0.8496 | 0.8611 |
+ | result_ablation_mbert_paper | mrpc | mBERT | mha_attention | eval_combined_score | 3 | 0.8574 | 0.0087 | 0.8476 | 0.8642 |
+ | result_ablation_mbert_paper | mrpc | mBERT | multi_branch_average | eval_combined_score | 3 | 0.8506 | 0.0045 | 0.8456 | 0.8542 |
+ | result_ablation_mbert_paper | sst2 | mBERT | attention | eval_accuracy | 3 | 0.8815 | 0.0147 | 0.8647 | 0.8922 |
+ | result_ablation_mbert_paper | sst2 | mBERT | cls | eval_accuracy | 3 | 0.8777 | 0.0018 | 0.8761 | 0.8796 |
+ | result_ablation_mbert_paper | sst2 | mBERT | gated_multi_branch | eval_accuracy | 3 | 0.8758 | 0.0035 | 0.8727 | 0.8796 |
+ | result_ablation_mbert_paper | sst2 | mBERT | hf_sequence_classifier | eval_accuracy | 3 | 0.8788 | 0.0098 | 0.8693 | 0.8888 |
+ | result_ablation_mbert_paper | sst2 | mBERT | max | eval_accuracy | 3 | 0.8746 | 0.0054 | 0.8704 | 0.8807 |
+ | result_ablation_mbert_paper | sst2 | mBERT | mean | eval_accuracy | 3 | 0.8807 | 0.0087 | 0.8727 | 0.8899 |
+ | result_ablation_mbert_paper | sst2 | mBERT | mha_attention | eval_accuracy | 3 | 0.8761 | 0.0020 | 0.8739 | 0.8773 |
+ | result_ablation_mbert_paper | sst2 | mBERT | multi_branch_average | eval_accuracy | 3 | 0.8815 | 0.0129 | 0.8704 | 0.8956 |
+ | result_ablation_mbert_paper | vsfc | mBERT | attention | eval_accuracy | 1 | 0.9368 | 0.0000 | 0.9368 | 0.9368 |
+ | result_ablation_mbert_paper | vsfc | mBERT | cls | eval_accuracy | 2 | 0.9334 | 0.0013 | 0.9324 | 0.9343 |
+ | result_ablation_mbert_paper | vsfc | mBERT | gated_multi_branch | eval_accuracy | 1 | 0.9280 | 0.0000 | 0.9280 | 0.9280 |
+ | result_ablation_mbert_paper | vsfc | mBERT | hf_sequence_classifier | eval_accuracy | 2 | 0.9321 | 0.0004 | 0.9318 | 0.9324 |
+ | result_ablation_mbert_paper | vsfc | mBERT | max | eval_accuracy | 2 | 0.9343 | 0.0009 | 0.9337 | 0.9349 |
+ | result_ablation_mbert_paper | vsfc | mBERT | mean | eval_accuracy | 1 | 0.9330 | 0.0000 | 0.9330 | 0.9330 |
+ | result_ablation_mbert_paper | vsfc | mBERT | mha_attention | eval_accuracy | 3 | 0.9335 | 0.0007 | 0.9330 | 0.9343 |
+ | result_ablation_mbert_paper | vsfc | mBERT | multi_branch_average | eval_accuracy | 1 | 0.9356 | 0.0000 | 0.9356 | 0.9356 |
+
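The aggregated columns can be reproduced from the raw runs with the standard library alone. The sketch below is an assumption about how the table was built, not the author's script: the `std` values in the table are consistent with the sample standard deviation (divisor n - 1), and groups with a single run report 0.0, so a zero there means "one seed only", not "no variance". The function name `aggregate` and the row layout are illustrative.

```python
import statistics
from collections import defaultdict


def aggregate(runs):
    """Group run scores by (task, strategy) and summarise them.

    Assumption: std is the sample standard deviation, with 0.0 substituted
    when only one run exists (statistics.stdev needs at least two points).
    """
    groups = defaultdict(list)
    for run in runs:
        groups[(run["task"], run["strategy"])].append(run["score"])

    summary = {}
    for key, scores in sorted(groups.items()):
        summary[key] = {
            "n": len(scores),
            "mean": statistics.fmean(scores),
            "std": statistics.stdev(scores) if len(scores) > 1 else 0.0,
            "min": min(scores),
            "max": max(scores),
        }
    return summary


# Scores copied from ablation_results.csv in this commit.
runs = [
    {"task": "cola", "strategy": "gated_multi_branch", "score": 0.7641385489211314},
    {"task": "cola", "strategy": "gated_multi_branch", "score": 0.7733151048821603},
    {"task": "cola", "strategy": "gated_multi_branch", "score": 0.7612027683763856},
    {"task": "vsfc", "strategy": "attention", "score": 0.9368288060644346},
]
summary = aggregate(runs)
for (task, strategy), stats in summary.items():
    print(task, strategy, stats["n"], f"{stats['mean']:.4f}", f"{stats['std']:.4f}")
```

Because several VSFC strategies have only one or two completed seeds, the `n` column matters when comparing their means against the three-seed GLUE rows.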
+ ## Gated Multi-Branch Deltas
+ | source | task | model | baseline | gated_mean | baseline_mean | delta |
+ | --- | --- | --- | --- | --- | --- | --- |
+ | result_ablation_mbert_paper | cola | mBERT | attention | 0.7662 | 0.7635 | 0.0027 |
+ | result_ablation_mbert_paper | cola | mBERT | mha_attention | 0.7662 | 0.7684 | -0.0022 |
+ | result_ablation_mbert_paper | cola | mBERT | multi_branch_average | 0.7662 | 0.7684 | -0.0022 |
+ | result_ablation_mbert_paper | cola | mBERT | hf_sequence_classifier | 0.7662 | 0.7643 | 0.0019 |
+ | result_ablation_mbert_paper | mrpc | mBERT | attention | 0.8475 | 0.8477 | -0.0002 |
+ | result_ablation_mbert_paper | mrpc | mBERT | mha_attention | 0.8475 | 0.8574 | -0.0099 |
+ | result_ablation_mbert_paper | mrpc | mBERT | multi_branch_average | 0.8475 | 0.8506 | -0.0032 |
+ | result_ablation_mbert_paper | mrpc | mBERT | hf_sequence_classifier | 0.8475 | 0.8337 | 0.0137 |
+ | result_ablation_mbert_paper | sst2 | mBERT | attention | 0.8758 | 0.8815 | -0.0057 |
+ | result_ablation_mbert_paper | sst2 | mBERT | mha_attention | 0.8758 | 0.8761 | -0.0004 |
+ | result_ablation_mbert_paper | sst2 | mBERT | multi_branch_average | 0.8758 | 0.8815 | -0.0057 |
+ | result_ablation_mbert_paper | sst2 | mBERT | hf_sequence_classifier | 0.8758 | 0.8788 | -0.0031 |
+ | result_ablation_mbert_paper | vsfc | mBERT | attention | 0.9280 | 0.9368 | -0.0088 |
+ | result_ablation_mbert_paper | vsfc | mBERT | mha_attention | 0.9280 | 0.9335 | -0.0055 |
+ | result_ablation_mbert_paper | vsfc | mBERT | multi_branch_average | 0.9280 | 0.9356 | -0.0076 |
+ | result_ablation_mbert_paper | vsfc | mBERT | hf_sequence_classifier | 0.9280 | 0.9321 | -0.0041 |
+
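Each delta in the table above is `gated_mean - baseline_mean`, so a negative value means the gated multi-branch head scored below that baseline. The sketch below recomputes the MRPC rows from the unrounded means in `ablation_results_aggregate.csv`. The function name `gated_deltas` and the baseline list are illustrative assumptions that mirror the four baselines shown in the table.

```python
BASELINES = (
    "attention",
    "mha_attention",
    "multi_branch_average",
    "hf_sequence_classifier",
)


def gated_deltas(means, gated="gated_multi_branch", baselines=BASELINES):
    """Return {baseline: gated_mean - baseline_mean} for one task.

    `means` maps strategy name to its mean score; baselines that are
    missing from the mapping are skipped.
    """
    gated_mean = means[gated]
    return {b: gated_mean - means[b] for b in baselines if b in means}


# Unrounded MRPC means copied from ablation_results_aggregate.csv.
mrpc_means = {
    "gated_multi_branch": 0.8474551528009678,
    "attention": 0.8476848551744681,
    "mha_attention": 0.8573676070505457,
    "multi_branch_average": 0.8506311087125982,
    "hf_sequence_classifier": 0.833747327296197,
}
deltas = gated_deltas(mrpc_means)
for baseline, delta in deltas.items():
    print(f"{baseline}: {delta:+.4f}")
```

The deltas are taken on unrounded means and rounded afterwards, which is why the `multi_branch_average` row reads -0.0032 even though the displayed means, 0.8475 and 0.8506, differ by 0.0031.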
+ ## Raw Runs
+ | source | task | model | strategy | seed | metric | score | eval_loss | train_loss | epoch | eval_samples | path |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
64
+ | result_ablation_mbert_paper | cola | mBERT | hf_sequence_classifier | 43.0000 | eval_matthews_correlation | 0.7706 | 0.2930 | 0.2763 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_43/all_results.json |
65
+ | result_ablation_mbert_paper | cola | mBERT | hf_sequence_classifier | 44.0000 | eval_matthews_correlation | 0.7730 | 0.2829 | 0.2657 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_44/all_results.json |
66
+ | result_ablation_mbert_paper | cola | mBERT | hf_sequence_classifier | 42.0000 | eval_matthews_correlation | 0.7493 | 0.2916 | 0.2650 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/hf_sequence_classifier/seed_42/all_results.json |
67
+ | result_ablation_mbert_paper | cola | mBERT | cls | 42.0000 | eval_matthews_correlation | 0.7377 | 0.3119 | 0.2640 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_42/all_results.json |
68
+ | result_ablation_mbert_paper | cola | mBERT | cls | 43.0000 | eval_matthews_correlation | 0.7682 | 0.2849 | 0.2700 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_43/all_results.json |
69
+ | result_ablation_mbert_paper | cola | mBERT | cls | 44.0000 | eval_matthews_correlation | 0.7682 | 0.2971 | 0.2545 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/cls/seed_44/all_results.json |
70
+ | result_ablation_mbert_paper | cola | mBERT | mean | 42.0000 | eval_matthews_correlation | 0.7327 | 0.3039 | 0.2534 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_42/all_results.json |
71
+ | result_ablation_mbert_paper | cola | mBERT | mean | 43.0000 | eval_matthews_correlation | 0.7541 | 0.2779 | 0.2586 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_43/all_results.json |
72
+ | result_ablation_mbert_paper | cola | mBERT | mean | 44.0000 | eval_matthews_correlation | 0.7801 | 0.2850 | 0.2490 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mean/seed_44/all_results.json |
73
+ | result_ablation_mbert_paper | cola | mBERT | max | 42.0000 | eval_matthews_correlation | 0.7355 | 0.3105 | 0.2658 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_42/all_results.json |
74
+ | result_ablation_mbert_paper | cola | mBERT | max | 43.0000 | eval_matthews_correlation | 0.7639 | 0.2830 | 0.2683 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_43/all_results.json |
75
+ | result_ablation_mbert_paper | cola | mBERT | max | 44.0000 | eval_matthews_correlation | 0.7660 | 0.2872 | 0.2616 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/max/seed_44/all_results.json |
76
+ | result_ablation_mbert_paper | cola | mBERT | attention | 42.0000 | eval_matthews_correlation | 0.7470 | 0.3018 | 0.2488 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_42/all_results.json |
77
+ | result_ablation_mbert_paper | cola | mBERT | attention | 43.0000 | eval_matthews_correlation | 0.7659 | 0.2830 | 0.2619 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_43/all_results.json |
78
+ | result_ablation_mbert_paper | cola | mBERT | attention | 44.0000 | eval_matthews_correlation | 0.7777 | 0.2868 | 0.2515 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/attention/seed_44/all_results.json |
79
+ | result_ablation_mbert_paper | cola | mBERT | mha_attention | 42.0000 | eval_matthews_correlation | 0.7567 | 0.2956 | 0.2555 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_42/all_results.json |
80
+ | result_ablation_mbert_paper | cola | mBERT | mha_attention | 43.0000 | eval_matthews_correlation | 0.7659 | 0.2815 | 0.2596 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_43/all_results.json |
81
+ | result_ablation_mbert_paper | cola | mBERT | mha_attention | 44.0000 | eval_matthews_correlation | 0.7825 | 0.2919 | 0.2563 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/mha_attention/seed_44/all_results.json |
82
+ | result_ablation_mbert_paper | cola | mBERT | multi_branch_average | 42.0000 | eval_matthews_correlation | 0.7659 | 0.3065 | 0.2480 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_42/all_results.json |
83
+ | result_ablation_mbert_paper | cola | mBERT | multi_branch_average | 43.0000 | eval_matthews_correlation | 0.7755 | 0.2922 | 0.2530 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_43/all_results.json |
84
+ | result_ablation_mbert_paper | cola | mBERT | multi_branch_average | 44.0000 | eval_matthews_correlation | 0.7638 | 0.3023 | 0.2463 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/multi_branch_average/seed_44/all_results.json |
85
+ | result_ablation_mbert_paper | cola | mBERT | gated_multi_branch | 42.0000 | eval_matthews_correlation | 0.7641 | 0.3013 | 0.2518 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_42/all_results.json |
86
+ | result_ablation_mbert_paper | cola | mBERT | gated_multi_branch | 43.0000 | eval_matthews_correlation | 0.7733 | 0.2900 | 0.2569 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_43/all_results.json |
87
+ | result_ablation_mbert_paper | cola | mBERT | gated_multi_branch | 44.0000 | eval_matthews_correlation | 0.7612 | 0.2972 | 0.2462 | 3.0000 | 1043 | /workspace/result_ablation_mbert_paper/cola/mBERT/gated_multi_branch/seed_44/all_results.json |
88
+ | result_ablation_mbert_paper | mrpc | mBERT | hf_sequence_classifier | 42.0000 | eval_combined_score | 0.8254 | 0.4255 | 0.4946 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_42/all_results.json |
89
+ | result_ablation_mbert_paper | mrpc | mBERT | hf_sequence_classifier | 43.0000 | eval_combined_score | 0.8456 | 0.3725 | 0.4622 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_43/all_results.json |
90
+ | result_ablation_mbert_paper | mrpc | mBERT | hf_sequence_classifier | 44.0000 | eval_combined_score | 0.8303 | 0.3804 | 0.4830 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/hf_sequence_classifier/seed_44/all_results.json |
91
+ | result_ablation_mbert_paper | mrpc | mBERT | cls | 42.0000 | eval_combined_score | 0.8410 | 0.3644 | 0.4560 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_42/all_results.json |
92
+ | result_ablation_mbert_paper | mrpc | mBERT | cls | 43.0000 | eval_combined_score | 0.8470 | 0.3629 | 0.4697 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_43/all_results.json |
93
+ | result_ablation_mbert_paper | mrpc | mBERT | cls | 44.0000 | eval_combined_score | 0.8300 | 0.3965 | 0.4668 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/cls/seed_44/all_results.json |
94
+ | result_ablation_mbert_paper | mrpc | mBERT | mean | 42.0000 | eval_combined_score | 0.8496 | 0.3613 | 0.4559 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_42/all_results.json |
95
+ | result_ablation_mbert_paper | mrpc | mBERT | mean | 43.0000 | eval_combined_score | 0.8611 | 0.3583 | 0.4500 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_43/all_results.json |
96
+ | result_ablation_mbert_paper | mrpc | mBERT | mean | 44.0000 | eval_combined_score | 0.8522 | 0.3472 | 0.4718 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mean/seed_44/all_results.json |
97
+ | result_ablation_mbert_paper | mrpc | mBERT | max | 42.0000 | eval_combined_score | 0.8576 | 0.3260 | 0.4347 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_42/all_results.json |
98
+ | result_ablation_mbert_paper | mrpc | mBERT | max | 43.0000 | eval_combined_score | 0.8481 | 0.3481 | 0.4461 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_43/all_results.json |
99
+ | result_ablation_mbert_paper | mrpc | mBERT | max | 44.0000 | eval_combined_score | 0.8535 | 0.3395 | 0.4507 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/max/seed_44/all_results.json |
100
+ | result_ablation_mbert_paper | mrpc | mBERT | attention | 42.0000 | eval_combined_score | 0.8390 | 0.3504 | 0.4567 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_42/all_results.json |
101
+ | result_ablation_mbert_paper | mrpc | mBERT | attention | 43.0000 | eval_combined_score | 0.8506 | 0.3571 | 0.4534 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_43/all_results.json |
102
+ | result_ablation_mbert_paper | mrpc | mBERT | attention | 44.0000 | eval_combined_score | 0.8535 | 0.3527 | 0.4397 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/attention/seed_44/all_results.json |
103
+ | result_ablation_mbert_paper | mrpc | mBERT | mha_attention | 42.0000 | eval_combined_score | 0.8476 | 0.3495 | 0.4544 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_42/all_results.json |
104
+ | result_ablation_mbert_paper | mrpc | mBERT | mha_attention | 43.0000 | eval_combined_score | 0.8642 | 0.3536 | 0.4513 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_43/all_results.json |
105
+ | result_ablation_mbert_paper | mrpc | mBERT | mha_attention | 44.0000 | eval_combined_score | 0.8603 | 0.3448 | 0.4316 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/mha_attention/seed_44/all_results.json |
106
+ | result_ablation_mbert_paper | mrpc | mBERT | multi_branch_average | 42.0000 | eval_combined_score | 0.8542 | 0.3541 | 0.4266 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_42/all_results.json |
107
+ | result_ablation_mbert_paper | mrpc | mBERT | multi_branch_average | 43.0000 | eval_combined_score | 0.8521 | 0.3448 | 0.4470 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_43/all_results.json |
108
+ | result_ablation_mbert_paper | mrpc | mBERT | multi_branch_average | 44.0000 | eval_combined_score | 0.8456 | 0.3684 | 0.4538 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/multi_branch_average/seed_44/all_results.json |
109
+ | result_ablation_mbert_paper | mrpc | mBERT | gated_multi_branch | 42.0000 | eval_combined_score | 0.8461 | 0.3519 | 0.4314 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_42/all_results.json |
110
+ | result_ablation_mbert_paper | mrpc | mBERT | gated_multi_branch | 43.0000 | eval_combined_score | 0.8388 | 0.3762 | 0.4740 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_43/all_results.json |
111
+ | result_ablation_mbert_paper | mrpc | mBERT | gated_multi_branch | 44.0000 | eval_combined_score | 0.8575 | 0.3612 | 0.4626 | 3.0000 | 408 | /workspace/result_ablation_mbert_paper/mrpc/mBERT/gated_multi_branch/seed_44/all_results.json |
112
+ | result_ablation_mbert_paper | sst2 | mBERT | hf_sequence_classifier | 42.0000 | eval_accuracy | 0.8693 | 0.4158 | 0.2582 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_42/all_results.json |
113
+ | result_ablation_mbert_paper | sst2 | mBERT | hf_sequence_classifier | 43.0000 | eval_accuracy | 0.8784 | 0.3953 | 0.2572 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_43/all_results.json |
114
+ | result_ablation_mbert_paper | sst2 | mBERT | hf_sequence_classifier | 44.0000 | eval_accuracy | 0.8888 | 0.3691 | 0.2579 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/hf_sequence_classifier/seed_44/all_results.json |
115
+ | result_ablation_mbert_paper | sst2 | mBERT | cls | 42.0000 | eval_accuracy | 0.8761 | 0.3886 | 0.2583 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_42/all_results.json |
116
+ | result_ablation_mbert_paper | sst2 | mBERT | cls | 43.0000 | eval_accuracy | 0.8796 | 0.4070 | 0.2557 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_43/all_results.json |
117
+ | result_ablation_mbert_paper | sst2 | mBERT | cls | 44.0000 | eval_accuracy | 0.8773 | 0.3818 | 0.2585 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/cls/seed_44/all_results.json |
118
+ | result_ablation_mbert_paper | sst2 | mBERT | mean | 42.0000 | eval_accuracy | 0.8727 | 0.4060 | 0.2542 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_42/all_results.json |
119
+ | result_ablation_mbert_paper | sst2 | mBERT | mean | 43.0000 | eval_accuracy | 0.8899 | 0.3604 | 0.2507 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_43/all_results.json |
120
+ | result_ablation_mbert_paper | sst2 | mBERT | mean | 44.0000 | eval_accuracy | 0.8796 | 0.3606 | 0.2528 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mean/seed_44/all_results.json |
121
+ | result_ablation_mbert_paper | sst2 | mBERT | max | 42.0000 | eval_accuracy | 0.8704 | 0.3956 | 0.2599 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_42/all_results.json |
122
+ | result_ablation_mbert_paper | sst2 | mBERT | max | 43.0000 | eval_accuracy | 0.8807 | 0.3866 | 0.2557 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_43/all_results.json |
123
+ | result_ablation_mbert_paper | sst2 | mBERT | max | 44.0000 | eval_accuracy | 0.8727 | 0.3852 | 0.2583 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/max/seed_44/all_results.json |
124
+ | result_ablation_mbert_paper | sst2 | mBERT | attention | 42.0000 | eval_accuracy | 0.8647 | 0.3977 | 0.2543 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_42/all_results.json |
125
+ | result_ablation_mbert_paper | sst2 | mBERT | attention | 43.0000 | eval_accuracy | 0.8876 | 0.3642 | 0.2510 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_43/all_results.json |
126
+ | result_ablation_mbert_paper | sst2 | mBERT | attention | 44.0000 | eval_accuracy | 0.8922 | 0.3617 | 0.2530 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/attention/seed_44/all_results.json |
127
+ | result_ablation_mbert_paper | sst2 | mBERT | mha_attention | 42.0000 | eval_accuracy | 0.8739 | 0.3983 | 0.2552 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_42/all_results.json |
128
+ | result_ablation_mbert_paper | sst2 | mBERT | mha_attention | 43.0000 | eval_accuracy | 0.8773 | 0.3742 | 0.2544 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_43/all_results.json |
129
+ | result_ablation_mbert_paper | sst2 | mBERT | mha_attention | 44.0000 | eval_accuracy | 0.8773 | 0.3714 | 0.2535 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/mha_attention/seed_44/all_results.json |
130
+ | result_ablation_mbert_paper | sst2 | mBERT | multi_branch_average | 42.0000 | eval_accuracy | 0.8704 | 0.3956 | 0.2518 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_42/all_results.json |
131
+ | result_ablation_mbert_paper | sst2 | mBERT | multi_branch_average | 43.0000 | eval_accuracy | 0.8956 | 0.3478 | 0.2549 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_43/all_results.json |
132
+ | result_ablation_mbert_paper | sst2 | mBERT | multi_branch_average | 44.0000 | eval_accuracy | 0.8784 | 0.3733 | 0.2547 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/multi_branch_average/seed_44/all_results.json |
133
+ | result_ablation_mbert_paper | sst2 | mBERT | gated_multi_branch | 42.0000 | eval_accuracy | 0.8727 | 0.3894 | 0.2527 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_42/all_results.json |
134
+ | result_ablation_mbert_paper | sst2 | mBERT | gated_multi_branch | 43.0000 | eval_accuracy | 0.8796 | 0.3644 | 0.2535 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_43/all_results.json |
135
+ | result_ablation_mbert_paper | sst2 | mBERT | gated_multi_branch | 44.0000 | eval_accuracy | 0.8750 | 0.3715 | 0.2535 | 3.0000 | 872 | /workspace/result_ablation_mbert_paper/sst2/mBERT/gated_multi_branch/seed_44/all_results.json |
136
+ | result_ablation_mbert_paper | vsfc | mBERT | hf_sequence_classifier | 42.0000 | eval_accuracy | 0.9324 | 0.2212 | 0.2344 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/hf_sequence_classifier/seed_42/all_results.json |
137
+ | result_ablation_mbert_paper | vsfc | mBERT | hf_sequence_classifier | 43.0000 | eval_accuracy | 0.9318 | 0.2235 | 0.2348 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/hf_sequence_classifier/seed_43/all_results.json |
138
+ | result_ablation_mbert_paper | vsfc | mBERT | cls | 42.0000 | eval_accuracy | 0.9324 | 0.2179 | 0.2280 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/cls/seed_42/all_results.json |
139
+ | result_ablation_mbert_paper | vsfc | mBERT | cls | 44.0000 | eval_accuracy | 0.9343 | 0.2194 | 0.2327 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/cls/seed_44/all_results.json |
140
+ | result_ablation_mbert_paper | vsfc | mBERT | mean | 42.0000 | eval_accuracy | 0.9330 | 0.2111 | 0.2192 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/mean/seed_42/all_results.json |
141
+ | result_ablation_mbert_paper | vsfc | mBERT | max | 42.0000 | eval_accuracy | 0.9337 | 0.2268 | 0.2342 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/max/seed_42/all_results.json |
142
+ | result_ablation_mbert_paper | vsfc | mBERT | max | 43.0000 | eval_accuracy | 0.9349 | 0.2257 | 0.2350 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/max/seed_43/all_results.json |
143
+ | result_ablation_mbert_paper | vsfc | mBERT | attention | 44.0000 | eval_accuracy | 0.9368 | 0.2090 | 0.2230 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/attention/seed_44/all_results.json |
144
+ | result_ablation_mbert_paper | vsfc | mBERT | mha_attention | 42.0000 | eval_accuracy | 0.9330 | 0.2153 | 0.2222 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_42/all_results.json |
145
+ | result_ablation_mbert_paper | vsfc | mBERT | mha_attention | 43.0000 | eval_accuracy | 0.9330 | 0.2182 | 0.2247 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_43/all_results.json |
146
+ | result_ablation_mbert_paper | vsfc | mBERT | mha_attention | 44.0000 | eval_accuracy | 0.9343 | 0.2153 | 0.2245 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/mha_attention/seed_44/all_results.json |
147
+ | result_ablation_mbert_paper | vsfc | mBERT | multi_branch_average | 42.0000 | eval_accuracy | 0.9356 | 0.2142 | 0.2183 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/multi_branch_average/seed_42/all_results.json |
148
+ | result_ablation_mbert_paper | vsfc | mBERT | gated_multi_branch | 44.0000 | eval_accuracy | 0.9280 | 0.2195 | 0.2245 | 3.0000 | 1583 | /workspace/result_ablation_mbert_paper/vsfc/mBERT/gated_multi_branch/seed_44/all_results.json |
mbert_paper_metrics/docs/reviewer_experiment_plan.md ADDED
@@ -0,0 +1,110 @@
1
+ # Reviewer-Focused Additional Experiments
2
+
3
+ The goal of these additional experiments is to directly address three recurring comments in the reviews:
4
+
5
+ 1. Clarify that the proposed model is a pooling/classification head placed on top of the PLM, not a replacement for the MHA blocks inside the Transformer.
6
+ 2. Show that the gain comes from the gate, not merely from adding attention pooling or a more complex head.
7
+ 3. Report stability using multiple seeds and the standard deviation.
8
+
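Item 3 amounts to aggregating per-seed scores into a mean and a standard deviation. Here is a minimal sketch using Python's `statistics` module, fed with the three MRPC `gated_multi_branch` scores from `docs/ablation_results.csv`. The `summarize` helper is hypothetical, and whether the project's summary script reports the sample or the population standard deviation is not stated here, so both are returned.

```python
from statistics import mean, pstdev, stdev


def summarize(scores):
    """Return (mean, sample std, population std) for per-seed scores."""
    if len(scores) < 2:
        # A single seed carries no spread information; report 0.0 for both.
        return mean(scores), 0.0, 0.0
    return mean(scores), stdev(scores), pstdev(scores)


# MRPC gated_multi_branch, seeds 42, 43, 44.
avg, sample_std, population_std = summarize([0.8461, 0.8388, 0.8575])
print(f"{avg:.4f} +/- {sample_std:.4f} (sample), {population_std:.4f} (population)")
```

With only three seeds the two estimates differ noticeably, so the paper should state which one it reports.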
9
+ ## Ablations to Run
10
+
11
+ Run every variant with the same backbone, the same split, and the same hyper-parameters; the only difference is `--pooling_strategy`.
12
+
13
+ | Strategy | Role in the paper |
14
+ | --- | --- |
15
+ | `hf_sequence_classifier` | Standard HuggingFace/PLM fine-tuning baseline. |
16
+ | `cls` | Baseline that pools the first-token representation, with the same MLP classifier. |
17
+ | `mean` | Masked mean pooling baseline. |
18
+ | `max` | Masked max pooling baseline. |
19
+ | `attention` | Standard attention pooling without a gate. This is the baseline the reviewers requested most explicitly. |
20
+ | `mha_attention` | One MHA layer plus attention pooling, with no multi-branch structure and no gate. |
21
+ | `multi_branch_average` | The same 3 MHA branches as the proposed method, but averaged uniformly; used to separate the benefit of the gate from the benefit of extra parameters/branches. |
22
+ | `gated_multi_branch` | The proposed method. |
23
+
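To make the simplest rows of the table concrete, here is a toy sketch of `cls`, masked `mean`, and masked `max` pooling over one sequence of token vectors. It uses plain Python lists for readability; the function names are hypothetical, and the project's own implementation (presumably on batched tensors) is not shown in this diff.

```python
from typing import List, Sequence

Vector = List[float]


def cls_pool(hidden: Sequence[Vector]) -> Vector:
    """`cls`: use the first token's vector as the sequence representation."""
    return list(hidden[0])


def masked_mean_pool(hidden: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """`mean`: average token vectors, ignoring positions where mask == 0."""
    kept = [vec for vec, keep in zip(hidden, mask) if keep]
    return [sum(dim) / len(kept) for dim in zip(*kept)]


def masked_max_pool(hidden: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """`max`: per-dimension maximum, ignoring positions where mask == 0."""
    kept = [vec for vec, keep in zip(hidden, mask) if keep]
    return [max(dim) for dim in zip(*kept)]


# Three tokens with two hidden dimensions; the last token is padding.
hidden = [[1.0, 4.0], [3.0, 0.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(cls_pool(hidden), masked_mean_pool(hidden, mask), masked_max_pool(hidden, mask))
```

The padding row is deliberately extreme: without the mask it would dominate both the mean and the max, which is why the baselines are described as masked.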
24
+ ## Recommended Commands on m-gpux
25
+
26
+ The m-gpux/Modal environment may currently be running Python 3.9, so `requirements.txt` is kept minimal and compatible with Python 3.9. Install the dependencies with:
27
+
28
+ ```bash
29
+ python -m pip install -r requirements.txt
30
+ ```
31
+
32
+ Smoke-test the pipeline before launching real runs:
33
+
34
+ ```bash
35
+ python scripts/run_ablation_grid.py \
36
+ --models PhoBERT \
37
+ --tasks cola \
38
+ --strategies hf_sequence_classifier attention gated_multi_branch \
39
+ --seeds 42 \
40
+ --limit 32 \
41
+ --max_runs 3
42
+ ```
43
+
44
+ Do not run `run_glue.py` directly for this ablation set, because the new runner script calls `run_glue_MHA_gated.py` itself with the full set of arguments for each baseline.
45
+
46
+ If the m-gpux UI only lets you pick a file and then invokes `python run_glue_MHA_gated.py` without arguments, this file has a no-arg launcher mode. By default it runs the `core` preset. It can be controlled through environment variables:
47
+
48
+ ```bash
49
+ ABLATION_PRESET=smoke ABLATION_DRY_RUN=1 python run_glue_MHA_gated.py
50
+ ABLATION_PRESET=smoke python run_glue_MHA_gated.py
51
+ ABLATION_PRESET=core python run_glue_MHA_gated.py
52
+ ABLATION_PRESET=full python run_glue_MHA_gated.py
53
+ ```
54
+
55
+ Commonly used variables: `ABLATION_LIMIT`, `ABLATION_MAX_RUNS`, `ABLATION_MODELS`, `ABLATION_TASKS`, `ABLATION_STRATEGIES`, `ABLATION_SEEDS`.
56
+
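As an illustration of how a no-arg launcher could pick up these variables, here is a hypothetical sketch. Only the variable names and the `core` default come from the text above; the `read_ablation_env` function, the accepted separators (spaces or commas), and the `"1"` convention for `ABLATION_DRY_RUN` are assumptions, and the real parsing in `run_glue_MHA_gated.py` may differ.

```python
import os
from typing import Dict, List, Mapping, Optional


def read_ablation_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Collect ABLATION_* settings into a dict (hypothetical helper)."""
    env = os.environ if env is None else env

    def words(name: str) -> Optional[List[str]]:
        # Assumption: list values are separated by spaces or commas.
        items = env.get(name, "").replace(",", " ").split()
        return items or None

    def integer(name: str) -> Optional[int]:
        raw = env.get(name, "").strip()
        return int(raw) if raw else None

    seeds = words("ABLATION_SEEDS")
    return {
        "preset": env.get("ABLATION_PRESET", "core"),  # documented default
        "dry_run": env.get("ABLATION_DRY_RUN", "0") == "1",  # assumed convention
        "limit": integer("ABLATION_LIMIT"),
        "max_runs": integer("ABLATION_MAX_RUNS"),
        "models": words("ABLATION_MODELS"),
        "tasks": words("ABLATION_TASKS"),
        "strategies": words("ABLATION_STRATEGIES"),
        "seeds": [int(s) for s in seeds] if seeds else None,
    }


config = read_ablation_env(
    {"ABLATION_PRESET": "smoke", "ABLATION_DRY_RUN": "1", "ABLATION_SEEDS": "42 43"}
)
print(config)
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real environment.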
57
+ Full ablation command for the reviewers; `bf16` and `tf32` are enabled by default:
58
+
59
+ ```bash
60
+ python scripts/run_ablation_grid.py \
61
+ --models all \
62
+ --tasks cola mrpc sst2 vnrte vsfc vsmec vtoc qqp \
63
+ --strategies hf_sequence_classifier cls mean max attention mha_attention multi_branch_average gated_multi_branch \
64
+ --seeds 42 43 44 \
65
+ --output_root result_ablation \
66
+ --epochs 3 \
67
+ --train_batch_size 32 \
68
+ --eval_batch_size 64
69
+ ```
70
+
71
+ For a smaller run that still answers the reviewers, run only the core baselines:
72
+
73
+ ```bash
74
+ python scripts/run_ablation_grid.py \
75
+ --models PhoBERT mDeBERTaV3 XLMR_base \
76
+ --tasks cola mrpc sst2 vnrte vsfc \
77
+ --strategies hf_sequence_classifier attention multi_branch_average gated_multi_branch \
78
+ --seeds 42 43 44
79
+ ```
80
+
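To budget compute for the core command, it helps to count the runs it implies. This is a rough sketch, assuming the runner launches one training run per (model, task, strategy, seed) combination; the enumeration logic inside `scripts/run_ablation_grid.py` is not shown in this diff.

```python
from itertools import product

# Values copied from the core command above.
models = ["PhoBERT", "mDeBERTaV3", "XLMR_base"]
tasks = ["cola", "mrpc", "sst2", "vnrte", "vsfc"]
strategies = [
    "hf_sequence_classifier",
    "attention",
    "multi_branch_average",
    "gated_multi_branch",
]
seeds = [42, 43, 44]

# Assumed: the grid is the full cross product of the four lists.
runs = list(product(models, tasks, strategies, seeds))
print(len(runs))
```

Under that assumption the core grid is 3 x 5 x 4 x 3 = 180 runs. The full command adds four more strategies and three more tasks, so its size depends on how many models `--models all` expands to.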
81
+ After the runs finish, generate the summary tables:
82
+
83
+ ```bash
84
+ python scripts/summarize_results.py --roots result_MHA result_ablation
85
+ ```
86
+
87
+ Files to take the numbers from:
88
+
89
+ - `docs/ablation_summary.md`
90
+ - `docs/ablation_results.csv`
91
+ - `docs/ablation_results_aggregate.csv`
92
+
93
+ ## How to Incorporate This into the Paper
94
+
95
+ The method should be renamed `Gated Multi-Branch Attention Pooling` or `GMAP`, rather than described as if it replaced the MHA blocks inside the PLM. Architecture description:
96
+
97
+ > We keep the pretrained Transformer backbone unchanged and attach a lightweight gated multi-branch attention pooling head on top of the final hidden states. The proposed gate dynamically combines representations produced by multiple attention branches with different head granularities.
98
+
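The contrast between `multi_branch_average` and `gated_multi_branch` can be illustrated with a toy example. This sketch assumes the gate is a softmax over one logit per branch; that is an assumption for illustration only, since the description above says just that the gate "dynamically combines" branch representations. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine(branches: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Weighted sum of branch vectors, dimension by dimension."""
    dims = len(branches[0])
    return [sum(w * vec[i] for w, vec in zip(weights, branches)) for i in range(dims)]


def average_branches(branches: Sequence[Vector]) -> Vector:
    """Ungated baseline: every branch gets the same weight."""
    n = len(branches)
    return combine(branches, [1.0 / n] * n)


def gated_branches(branches: Sequence[Vector], gate_logits: Sequence[float]) -> Vector:
    """Gated variant: weights come from a softmax over per-branch logits (assumed)."""
    return combine(branches, softmax(gate_logits))


branches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three branch outputs, two dims
uniform = average_branches(branches)
gated_equal = gated_branches(branches, [0.0, 0.0, 0.0])
gated_skewed = gated_branches(branches, [5.0, 0.0, 0.0])
print(uniform, gated_equal, gated_skewed)
```

With equal logits the gated output coincides with the uniform average, which is why the ungated multi-branch baseline isolates the effect of the gate: both variants share the same branches and differ only in how the weights are chosen.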
99
+ The ablation table should have these columns:
100
+
101
+ | Task | Backbone | CLS | Mean | Max | Attn pooling | MHA+Attn | Multi-branch avg | Ours |
102
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- |
103
+
104
+ In the rebuttal response, if the results support the claim, write concisely:
105
+
106
+ > To isolate the contribution of the gate, we added standard attention pooling and ungated multi-branch pooling baselines under the same PLM backbone and hyper-parameters. The gated variant consistently improves over attention pooling and over the ungated multi-branch average on tasks where our method previously showed the largest gains, indicating that the improvement is not merely due to adding a larger pooling head.
107
+
108
+ If some tasks regress, say so directly:
109
+
110
+ > The ablation also confirms that the gate is less robust on NLI-style reasoning tasks, where excessive suppression of branch-specific signals can hurt entailment decisions. We now discuss this as a limitation and avoid overclaiming language-specific universality.