| ============================================================ |
| CIFAR-10 - Dual-Stream GeoLIP ViT - EXP 4 |
| From scratch, 100 epochs, lr=0.0003 |
| CV: weight=0.1, target=0.22 |
| InfoNCE: weight=0.1 → OFF after mastery |
| Autograd: tang=1.0, sep=0.1 |
| Label smoothing: 0.1 |
| Geo classifier: ON (0.3), Geo diversity: ON (0.5) |
| Fused residual scaling: 1/√(depth+1) |
| Device: cuda |
| ============================================================ |
| Train: 50,000 (two views) Val: 10,000 |
|
|
| Building model... |
| Training from scratch |
| Parameters: 6,339,344 |
| Geo route: 2,552,764 (40.3%) |
| Std route: 3,786,580 (59.7%) |
|
|
| ============================================================ |
| TRAINING - 100 epochs, lr=0.0003, batch=256 |
| CV=0.1, autograd=ON (tang=1.0) |
| Label smoothing: 0.1 |
| Mastery: patience=50, margin 0.1→0.3 over 5000 batches |
| InfoNCE=0.1 → OFF after mastery activation |
| Geo classifier: ON (0.3), Geo diversity: ON (0.5) |
| Fused residual scaling: 1/√(depth+1) |
| Optimizer: AdamW (wd=0.01) |
| ============================================================ |
| E 1/100: 100%|██████████| 195/195 [00:26<00:00, 7.44batch/s, acc=15.4%, cvf=0.0630, ga=13%, loss=1.0016, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 1: train=15.6% val=25.7% geo=13% loss=0.9965/0.2922 cv=0.3862(f=0.06160 g=0.09415) gd=0.0060 cm=100% anch=15/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) β
|
| E 2/100: 100%|██████████| 195/195 [00:26<00:00, 7.43batch/s, acc=30.9%, cvf=0.0068, ga=11%, loss=0.7093, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 2: train=30.9% val=35.5% geo=11% loss=0.7064/0.2656 cv=0.3114(f=0.00678 g=0.01938) gd=0.0021 cm=100% anch=56/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) β
|
| E 3/100: 100%|██████████| 195/195 [00:26<00:00, 7.42batch/s, acc=38.7%, cvf=0.0019, ga=10%, loss=0.5053, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 3: train=38.9% val=44.9% geo=10% loss=0.5039/0.2441 cv=0.2569(f=0.00190 g=0.00537) gd=0.0024 cm=100% anch=60/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) β
|
| E 4/100: 100%|██████████| 195/195 [00:27<00:00, 7.13batch/s, acc=47.2%, cvf=0.0006, ga=11%, loss=0.4164, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 4: train=47.3% val=52.7% geo=11% loss=0.4156/0.2173 cv=0.2267(f=0.00058 g=0.00105) gd=0.0006 cm=100% anch=58/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 5/100: 100%|██████████| 195/195 [00:26<00:00, 7.33batch/s, acc=53.1%, cvf=0.0007, ga=12%, loss=0.3712, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 5: train=53.1% val=57.0% geo=12% loss=0.3708/0.1981 cv=0.2124(f=0.00073 g=0.00067) gd=0.0003 cm=100% anch=56/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 6/100: 100%|██████████| 195/195 [00:26<00:00, 7.25batch/s, acc=57.8%, cvf=0.0013, ga=19%, loss=0.3422, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 6: train=57.8% val=64.0% geo=19% loss=0.3420/0.1748 cv=0.1958(f=0.00134 g=0.00069) gd=0.0003 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 7/100: 100%|██████████| 195/195 [00:27<00:00, 7.16batch/s, acc=61.3%, cvf=0.0021, ga=24%, loss=0.3192, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 7: train=61.3% val=62.6% geo=24% loss=0.3192/0.1755 cv=0.1868(f=0.00215 g=0.00077) gd=0.0002 cm=100% anch=62/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) |
| E 8/100: 100%|██████████| 195/195 [00:27<00:00, 7.16batch/s, acc=63.4%, cvf=0.0026, ga=29%, loss=0.3056, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 8: train=63.4% val=65.5% geo=30% loss=0.3054/0.1617 cv=0.1648(f=0.00261 g=0.00083) gd=0.0002 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 9/100: 100%|██████████| 195/195 [00:27<00:00, 7.22batch/s, acc=65.6%, cvf=0.0028, ga=34%, loss=0.2942, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 9: train=65.6% val=69.2% geo=34% loss=0.2941/0.1499 cv=0.1603(f=0.00280 g=0.00079) gd=0.0001 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 10/100: 100%|██████████| 195/195 [00:27<00:00, 7.09batch/s, acc=66.9%, cvf=0.0036, ga=37%, loss=0.2867, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 10: train=66.9% val=68.3% geo=37% loss=0.2866/0.1509 cv=0.1584(f=0.00359 g=0.00079) gd=0.0001 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) |
| E 11/100: 100%|██████████| 195/195 [00:27<00:00, 7.15batch/s, acc=68.4%, cvf=0.0034, ga=40%, loss=0.2796, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 11: train=68.5% val=72.4% geo=40% loss=0.2794/0.1372 cv=0.1626(f=0.00339 g=0.00073) gd=0.0001 cm=100% anch=64/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 12/100: 100%|██████████| 195/195 [00:27<00:00, 7.17batch/s, acc=69.8%, cvf=0.0040, ga=43%, loss=0.2718, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 12: train=69.8% val=72.0% geo=43% loss=0.2718/0.1379 cv=0.1613(f=0.00398 g=0.00069) gd=0.0001 cm=100% anch=64/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) |
| E 13/100: 100%|██████████| 195/195 [00:26<00:00, 7.45batch/s, acc=70.9%, cvf=0.0043, ga=45%, loss=0.2660, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 13: train=70.9% val=73.4% geo=45% loss=0.2659/0.1323 cv=0.1612(f=0.00435 g=0.00073) gd=0.0001 cm=100% anch=64/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) β
|
| E 14/100: 100%|██████████| 195/195 [00:26<00:00, 7.47batch/s, acc=71.6%, cvf=0.0045, ga=48%, loss=0.2618, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 14: train=71.6% val=74.8% geo=48% loss=0.2617/0.1254 cv=0.1627(f=0.00442 g=0.00065) gd=0.0001 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) β
|
| E 15/100: 100%|██████████| 195/195 [00:27<00:00, 7.16batch/s, acc=73.0%, cvf=0.0047, ga=50%, loss=0.2558, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 15: train=73.0% val=75.5% geo=50% loss=0.2557/0.1240 cv=0.1557(f=0.00469 g=0.00064) gd=0.0001 cm=100% anch=64/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 16/100: 100%|██████████| 195/195 [00:26<00:00, 7.32batch/s, acc=73.6%, cvf=0.0047, ga=51%, loss=0.2519, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 16: train=73.6% val=76.2% geo=52% loss=0.2517/0.1222 cv=0.1584(f=0.00474 g=0.00047) gd=0.0001 cm=100% anch=63/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 17/100: 100%|██████████| 195/195 [00:26<00:00, 7.28batch/s, acc=74.3%, cvf=0.0049, ga=53%, loss=0.2485, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 17: train=74.2% val=77.3% geo=53% loss=0.2484/0.1168 cv=0.1684(f=0.00493 g=0.00071) gd=0.0001 cm=100% anch=62/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (27s) β
|
| E 18/100: 100%|██████████| 195/195 [00:26<00:00, 7.43batch/s, acc=75.2%, cvf=0.0049, ga=54%, loss=0.2445, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| E 18: train=75.1% val=76.9% geo=54% loss=0.2447/0.1175 cv=0.1493(f=0.00491 g=0.00065) gd=0.0001 cm=100% anch=64/64 [stage1] mst=0.000 mrg=0.10 hn=0.000 hp=0.000 q=0 (26s) |
| E 19/100: 40%|████      | 78/195 [00:10<00:15, 7.76batch/s, acc=76.2%, cvf=0.0048, ga=55%, loss=0.2398, mrg=0.10, mst=0.000, ordered=1, stg=S1] |
| MASTERY ACTIVATED at batch 3587 (nce_acc=1.0 for 50 consecutive) [InfoNCE OFF, margin 0.1→0.3] |
| E 19/100: 100%|██████████| 195/195 [00:25<00:00, 7.78batch/s, acc=69.8%, cvf=0.0219, ga=54%, loss=0.5168, mrg=0.10, mst=0.259, ordered=1, stg=M] |
| E 19: train=69.8% val=71.8% geo=54% loss=0.5134/0.1374 cv=0.3807(f=0.02206 g=0.00064) gd=0.0001 cm=100% anch=7/64 [MASTERY] mst=0.256 mrg=0.10 hn=0.556 hp=0.361 q=4096 (25s) |
| E 20/100: 100%|██████████| 195/195 [00:24<00:00, 7.90batch/s, acc=73.9%, cvf=0.0187, ga=57%, loss=0.3768, mrg=0.11, mst=0.137, ordered=1, stg=M] |
| E 20: train=74.0% val=78.5% geo=57% loss=0.3767/0.1103 cv=0.3564(f=0.01852 g=0.00040) gd=0.0001 cm=100% anch=2/64 [MASTERY] mst=0.137 mrg=0.11 hn=0.992 hp=0.963 q=4096 (25s) β
|
| E 21/100: 100%|██████████| 195/195 [00:24<00:00, 7.98batch/s, acc=75.5%, cvf=0.0123, ga=58%, loss=0.3745, mrg=0.12, mst=0.142, ordered=1, stg=M] |
| E 21: train=75.5% val=79.5% geo=58% loss=0.3746/0.1073 cv=0.3543(f=0.01233 g=0.00049) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.142 mrg=0.12 hn=0.992 hp=0.967 q=4096 (24s) β
|
| E 22/100: 100%|██████████| 195/195 [00:25<00:00, 7.69batch/s, acc=76.8%, cvf=0.0113, ga=58%, loss=0.3759, mrg=0.13, mst=0.147, ordered=1, stg=M] |
| E 22: train=76.8% val=77.7% geo=58% loss=0.3760/0.1133 cv=0.3140(f=0.01117 g=0.00047) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.147 mrg=0.13 hn=0.992 hp=0.970 q=4096 (25s) |
| E 23/100: 100%|██████████| 195/195 [00:25<00:00, 7.72batch/s, acc=77.3%, cvf=0.0092, ga=59%, loss=0.3785, mrg=0.14, mst=0.153, ordered=1, stg=M] |
| E 23: train=77.2% val=78.6% geo=59% loss=0.3787/0.1126 cv=0.3158(f=0.00916 g=0.00049) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.153 mrg=0.14 hn=0.992 hp=0.972 q=4096 (25s) |
| E 24/100: 100%|██████████| 195/195 [00:25<00:00, 7.77batch/s, acc=77.3%, cvf=0.0088, ga=59%, loss=0.3866, mrg=0.14, mst=0.161, ordered=1, stg=M] |
| E 24: train=77.3% val=78.4% geo=59% loss=0.3869/0.1107 cv=0.2961(f=0.00893 g=0.00053) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.161 mrg=0.14 hn=0.992 hp=0.972 q=4096 (25s) |
| E 25/100: 100%|██████████| 195/195 [00:24<00:00, 7.94batch/s, acc=77.4%, cvf=0.0085, ga=59%, loss=0.3921, mrg=0.15, mst=0.167, ordered=1, stg=M] |
| E 25: train=77.5% val=79.5% geo=59% loss=0.3921/0.1086 cv=0.2740(f=0.00851 g=0.00052) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.168 mrg=0.15 hn=0.992 hp=0.974 q=4096 (25s) |
| E 26/100: 100%|██████████| 195/195 [00:25<00:00, 7.78batch/s, acc=77.7%, cvf=0.0075, ga=59%, loss=0.3987, mrg=0.16, mst=0.175, ordered=1, stg=M] |
| E 26: train=77.8% val=80.5% geo=59% loss=0.3987/0.1028 cv=0.3288(f=0.00748 g=0.00047) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.175 mrg=0.16 hn=0.993 hp=0.973 q=4096 (25s) β
|
| E 27/100: 100%|██████████| 195/195 [00:24<00:00, 8.12batch/s, acc=78.1%, cvf=0.0080, ga=59%, loss=0.4032, mrg=0.17, mst=0.181, ordered=1, stg=M] |
| E 27: train=78.1% val=78.4% geo=59% loss=0.4033/0.1095 cv=0.2901(f=0.00799 g=0.00058) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.181 mrg=0.17 hn=0.994 hp=0.976 q=4096 (24s) |
| E 28/100: 100%|██████████| 195/195 [00:24<00:00, 7.98batch/s, acc=78.2%, cvf=0.0072, ga=60%, loss=0.4089, mrg=0.17, mst=0.188, ordered=1, stg=M] |
| E 28: train=78.2% val=79.4% geo=60% loss=0.4091/0.1065 cv=0.2801(f=0.00719 g=0.00050) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.188 mrg=0.17 hn=0.995 hp=0.978 q=4096 (24s) |
| E 29/100: 100%|██████████| 195/195 [00:24<00:00, 8.10batch/s, acc=78.3%, cvf=0.0070, ga=60%, loss=0.4162, mrg=0.18, mst=0.196, ordered=1, stg=M] |
| E 29: train=78.3% val=80.3% geo=60% loss=0.4162/0.1038 cv=0.2960(f=0.00694 g=0.00052) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.196 mrg=0.18 hn=0.995 hp=0.978 q=4096 (24s) |
| E 30/100: 100%|██████████| 195/195 [00:24<00:00, 8.09batch/s, acc=78.8%, cvf=0.0064, ga=61%, loss=0.4216, mrg=0.19, mst=0.203, ordered=1, stg=M] |
| E 30: train=78.7% val=78.9% geo=61% loss=0.4218/0.1118 cv=0.3189(f=0.00643 g=0.00048) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.203 mrg=0.19 hn=0.996 hp=0.978 q=4096 (24s) |
| E 31/100: 100%|██████████| 195/195 [00:24<00:00, 7.98batch/s, acc=78.7%, cvf=0.0071, ga=61%, loss=0.4303, mrg=0.20, mst=0.211, ordered=1, stg=M] |
| E 31: train=78.6% val=78.7% geo=61% loss=0.4304/0.1089 cv=0.2933(f=0.00718 g=0.00057) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.211 mrg=0.20 hn=0.996 hp=0.978 q=4096 (24s) |
| E 32/100: 100%|██████████| 195/195 [00:24<00:00, 7.93batch/s, acc=78.8%, cvf=0.0066, ga=61%, loss=0.4365, mrg=0.21, mst=0.218, ordered=1, stg=M] |
| E 32: train=78.8% val=79.0% geo=61% loss=0.4365/0.1105 cv=0.3057(f=0.00660 g=0.00050) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.218 mrg=0.21 hn=0.996 hp=0.979 q=4096 (25s) |
| E 33/100: 100%|██████████| 195/195 [00:24<00:00, 8.08batch/s, acc=79.5%, cvf=0.0064, ga=62%, loss=0.4403, mrg=0.21, mst=0.225, ordered=1, stg=M] |
| E 33: train=79.5% val=79.4% geo=62% loss=0.4405/0.1082 cv=0.3318(f=0.00628 g=0.00051) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.225 mrg=0.21 hn=0.996 hp=0.980 q=4096 (24s) |
| E 34/100: 100%|██████████| 195/195 [00:24<00:00, 7.83batch/s, acc=79.5%, cvf=0.0070, ga=62%, loss=0.4486, mrg=0.22, mst=0.234, ordered=1, stg=M] |
| E 34: train=79.5% val=80.8% geo=62% loss=0.4486/0.1009 cv=0.3150(f=0.00700 g=0.00055) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.234 mrg=0.22 hn=0.996 hp=0.979 q=4096 (25s) β
|
| E 35/100: 100%|██████████| 195/195 [00:24<00:00, 8.09batch/s, acc=79.8%, cvf=0.0067, ga=62%, loss=0.4545, mrg=0.23, mst=0.241, ordered=1, stg=M] |
| E 35: train=79.8% val=81.4% geo=62% loss=0.4546/0.1000 cv=0.2936(f=0.00666 g=0.00046) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.241 mrg=0.23 hn=0.996 hp=0.979 q=4096 (24s) β
|
| E 36/100: 100%|██████████| 195/195 [00:24<00:00, 7.94batch/s, acc=80.1%, cvf=0.0056, ga=63%, loss=0.4603, mrg=0.24, mst=0.248, ordered=1, stg=M] |
| E 36: train=80.0% val=81.0% geo=63% loss=0.4604/0.1015 cv=0.3145(f=0.00552 g=0.00038) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.248 mrg=0.24 hn=0.996 hp=0.981 q=4096 (25s) |
| E 37/100: 100%|██████████| 195/195 [00:24<00:00, 7.90batch/s, acc=80.7%, cvf=0.0065, ga=63%, loss=0.4660, mrg=0.24, mst=0.255, ordered=1, stg=M] |
| E 37: train=80.7% val=80.9% geo=63% loss=0.4662/0.1018 cv=0.2868(f=0.00645 g=0.00048) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.255 mrg=0.25 hn=0.996 hp=0.981 q=4096 (25s) |
| E 38/100: 100%|██████████| 195/195 [00:24<00:00, 8.08batch/s, acc=80.9%, cvf=0.0060, ga=63%, loss=0.4730, mrg=0.25, mst=0.263, ordered=1, stg=M] |
| E 38: train=80.9% val=81.5% geo=63% loss=0.4730/0.1002 cv=0.3044(f=0.00590 g=0.00043) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.263 mrg=0.25 hn=0.996 hp=0.981 q=4096 (24s) β
|
| E 39/100: 100%|██████████| 195/195 [00:24<00:00, 7.81batch/s, acc=81.0%, cvf=0.0060, ga=64%, loss=0.4792, mrg=0.26, mst=0.271, ordered=1, stg=M] |
| E 39: train=80.9% val=81.1% geo=64% loss=0.4794/0.1030 cv=0.3079(f=0.00589 g=0.00047) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.271 mrg=0.26 hn=0.996 hp=0.981 q=4096 (25s) |
| E 40/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=81.1%, cvf=0.0054, ga=64%, loss=0.4879, mrg=0.27, mst=0.279, ordered=1, stg=M] |
| E 40: train=81.1% val=81.7% geo=64% loss=0.4877/0.0999 cv=0.2930(f=0.00540 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.280 mrg=0.27 hn=0.996 hp=0.981 q=4096 (25s) β
|
| E 41/100: 100%|██████████| 195/195 [00:24<00:00, 7.97batch/s, acc=81.5%, cvf=0.0057, ga=64%, loss=0.4933, mrg=0.28, mst=0.287, ordered=1, stg=M] |
| E 41: train=81.5% val=81.8% geo=64% loss=0.4934/0.1010 cv=0.3241(f=0.00565 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.287 mrg=0.28 hn=0.996 hp=0.981 q=4096 (24s) β
|
| E 42/100: 100%|██████████| 195/195 [00:24<00:00, 7.86batch/s, acc=81.6%, cvf=0.0055, ga=64%, loss=0.4990, mrg=0.28, mst=0.293, ordered=1, stg=M] |
| E 42: train=81.6% val=81.4% geo=64% loss=0.4991/0.0996 cv=0.2825(f=0.00547 g=0.00049) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.293 mrg=0.28 hn=0.996 hp=0.983 q=4096 (25s) |
| E 43/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=82.1%, cvf=0.0050, ga=65%, loss=0.5048, mrg=0.29, mst=0.301, ordered=1, stg=M] |
| E 43: train=82.1% val=82.4% geo=64% loss=0.5051/0.0962 cv=0.2843(f=0.00500 g=0.00047) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.301 mrg=0.29 hn=0.996 hp=0.982 q=4096 (25s) β
|
| E 44/100: 100%|██████████| 195/195 [00:24<00:00, 7.83batch/s, acc=82.5%, cvf=0.0050, ga=65%, loss=0.5111, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 44: train=82.5% val=83.2% geo=65% loss=0.5111/0.0920 cv=0.3112(f=0.00500 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.982 q=4096 (25s) β
|
| E 45/100: 100%|██████████| 195/195 [00:25<00:00, 7.79batch/s, acc=82.9%, cvf=0.0047, ga=65%, loss=0.5144, mrg=0.30, mst=0.314, ordered=1, stg=M] |
| E 45: train=82.9% val=82.4% geo=65% loss=0.5144/0.0950 cv=0.2718(f=0.00468 g=0.00048) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.314 mrg=0.30 hn=0.996 hp=0.983 q=4096 (25s) |
| E 46/100: 100%|██████████| 195/195 [00:25<00:00, 7.73batch/s, acc=82.9%, cvf=0.0047, ga=66%, loss=0.5132, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 46: train=82.9% val=82.5% geo=66% loss=0.5131/0.0972 cv=0.2962(f=0.00471 g=0.00043) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.983 q=4096 (25s) |
| E 47/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=83.6%, cvf=0.0044, ga=66%, loss=0.5106, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 47: train=83.6% val=82.9% geo=66% loss=0.5107/0.0947 cv=0.2938(f=0.00444 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 48/100: 100%|██████████| 195/195 [00:25<00:00, 7.72batch/s, acc=84.1%, cvf=0.0046, ga=66%, loss=0.5086, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 48: train=84.1% val=83.2% geo=66% loss=0.5086/0.0912 cv=0.2900(f=0.00462 g=0.00050) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) β
|
| E 49/100: 100%|██████████| 195/195 [00:25<00:00, 7.75batch/s, acc=84.3%, cvf=0.0045, ga=67%, loss=0.5077, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 49: train=84.2% val=82.7% geo=67% loss=0.5079/0.0960 cv=0.2604(f=0.00447 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.983 q=4096 (25s) |
| E 50/100: 100%|██████████| 195/195 [00:25<00:00, 7.80batch/s, acc=84.7%, cvf=0.0043, ga=67%, loss=0.5059, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 50: train=84.7% val=84.1% geo=67% loss=0.5059/0.0899 cv=0.2509(f=0.00430 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) β
|
| E 51/100: 100%|██████████| 195/195 [00:23<00:00, 8.14batch/s, acc=84.7%, cvf=0.0043, ga=67%, loss=0.5051, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 51: train=84.7% val=83.5% geo=67% loss=0.5051/0.0925 cv=0.2687(f=0.00431 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) |
| E 52/100: 100%|██████████| 195/195 [00:23<00:00, 8.13batch/s, acc=85.0%, cvf=0.0042, ga=67%, loss=0.5046, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 52: train=85.0% val=84.3% geo=67% loss=0.5047/0.0893 cv=0.2658(f=0.00416 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) β
|
| E 53/100: 100%|██████████| 195/195 [00:24<00:00, 8.00batch/s, acc=85.4%, cvf=0.0046, ga=68%, loss=0.5029, mrg=0.30, mst=0.313, ordered=1, stg=M] |
| E 53: train=85.4% val=82.9% geo=68% loss=0.5028/0.0920 cv=0.2606(f=0.00451 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.313 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) |
| E 54/100: 100%|██████████| 195/195 [00:25<00:00, 7.68batch/s, acc=85.9%, cvf=0.0039, ga=68%, loss=0.5004, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 54: train=85.9% val=84.6% geo=68% loss=0.5005/0.0865 cv=0.2609(f=0.00391 g=0.00049) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) β
|
| E 55/100: 100%|██████████| 195/195 [00:24<00:00, 8.07batch/s, acc=86.2%, cvf=0.0038, ga=68%, loss=0.4992, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 55: train=86.2% val=84.2% geo=68% loss=0.4993/0.0882 cv=0.2849(f=0.00378 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) |
| E 56/100: 100%|██████████| 195/195 [00:25<00:00, 7.72batch/s, acc=86.5%, cvf=0.0040, ga=68%, loss=0.4981, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 56: train=86.5% val=85.3% geo=68% loss=0.4979/0.0838 cv=0.2565(f=0.00398 g=0.00043) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) β
|
| E 57/100: 100%|██████████| 195/195 [00:24<00:00, 7.96batch/s, acc=87.1%, cvf=0.0039, ga=69%, loss=0.4951, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 57: train=87.1% val=84.5% geo=69% loss=0.4953/0.0870 cv=0.2629(f=0.00390 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) |
| E 58/100: 100%|██████████| 195/195 [00:25<00:00, 7.73batch/s, acc=86.9%, cvf=0.0037, ga=69%, loss=0.4954, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 58: train=86.8% val=83.8% geo=69% loss=0.4956/0.0924 cv=0.2771(f=0.00369 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 59/100: 100%|██████████| 195/195 [00:25<00:00, 7.77batch/s, acc=87.5%, cvf=0.0037, ga=69%, loss=0.4938, mrg=0.30, mst=0.312, ordered=1, stg=M] |
| E 59: train=87.5% val=85.3% geo=69% loss=0.4937/0.0846 cv=0.2759(f=0.00379 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.312 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 60/100: 100%|██████████| 195/195 [00:23<00:00, 8.15batch/s, acc=87.6%, cvf=0.0033, ga=69%, loss=0.4924, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 60: train=87.6% val=85.3% geo=69% loss=0.4923/0.0843 cv=0.2711(f=0.00331 g=0.00047) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (24s) β
|
| E 61/100: 100%|██████████| 195/195 [00:25<00:00, 7.73batch/s, acc=87.8%, cvf=0.0035, ga=70%, loss=0.4906, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 61: train=87.8% val=85.0% geo=70% loss=0.4906/0.0849 cv=0.2611(f=0.00354 g=0.00043) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 62/100: 100%|██████████| 195/195 [00:24<00:00, 7.90batch/s, acc=88.3%, cvf=0.0031, ga=70%, loss=0.4887, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 62: train=88.3% val=85.1% geo=70% loss=0.4888/0.0866 cv=0.2625(f=0.00311 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 63/100: 100%|██████████| 195/195 [00:24<00:00, 7.95batch/s, acc=88.6%, cvf=0.0032, ga=70%, loss=0.4882, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 63: train=88.6% val=85.1% geo=70% loss=0.4880/0.0845 cv=0.2438(f=0.00323 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 64/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=88.7%, cvf=0.0034, ga=70%, loss=0.4868, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 64: train=88.7% val=85.8% geo=70% loss=0.4868/0.0822 cv=0.2398(f=0.00334 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) β
|
| E 65/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=89.2%, cvf=0.0033, ga=70%, loss=0.4852, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 65: train=89.2% val=85.5% geo=70% loss=0.4852/0.0828 cv=0.2795(f=0.00329 g=0.00048) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 66/100: 100%|██████████| 195/195 [00:25<00:00, 7.77batch/s, acc=89.5%, cvf=0.0034, ga=71%, loss=0.4843, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 66: train=89.4% val=84.7% geo=71% loss=0.4845/0.0867 cv=0.2425(f=0.00337 g=0.00038) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.985 q=4096 (25s) |
| E 67/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=89.8%, cvf=0.0031, ga=71%, loss=0.4829, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 67: train=89.8% val=85.3% geo=71% loss=0.4829/0.0839 cv=0.2650(f=0.00310 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.984 q=4096 (25s) |
| E 68/100: 100%|██████████| 195/195 [00:24<00:00, 7.87batch/s, acc=89.7%, cvf=0.0027, ga=71%, loss=0.4836, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 68: train=89.7% val=85.6% geo=71% loss=0.4836/0.0839 cv=0.2679(f=0.00269 g=0.00039) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.995 hp=0.984 q=4096 (25s) |
| E 69/100: 100%|██████████| 195/195 [00:23<00:00, 8.13batch/s, acc=90.3%, cvf=0.0029, ga=71%, loss=0.4807, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 69: train=90.2% val=86.2% geo=71% loss=0.4808/0.0814 cv=0.2527(f=0.00286 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.985 q=4096 (24s) β
|
| E 70/100: 100%|██████████| 195/195 [00:25<00:00, 7.78batch/s, acc=90.5%, cvf=0.0030, ga=71%, loss=0.4804, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 70: train=90.5% val=85.7% geo=71% loss=0.4804/0.0827 cv=0.2801(f=0.00292 g=0.00048) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.985 q=4096 (25s) |
| E 71/100: 100%|██████████| 195/195 [00:25<00:00, 7.72batch/s, acc=90.7%, cvf=0.0028, ga=72%, loss=0.4790, mrg=0.30, mst=0.311, ordered=1, stg=M] |
| E 71: train=90.7% val=86.2% geo=71% loss=0.4790/0.0811 cv=0.2769(f=0.00281 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.311 mrg=0.30 hn=0.996 hp=0.985 q=4096 (25s) β
|
| E 72/100: 100%|██████████| 195/195 [00:25<00:00, 7.75batch/s, acc=91.1%, cvf=0.0027, ga=71%, loss=0.4779, mrg=0.30, mst=0.310, ordered=1, stg=M] |
| E 72: train=91.1% val=86.2% geo=71% loss=0.4781/0.0821 cv=0.2526(f=0.00275 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.310 mrg=0.30 hn=0.996 hp=0.986 q=4096 (25s) |
| E 73/100: 100%|██████████| 195/195 [00:24<00:00, 7.84batch/s, acc=91.4%, cvf=0.0028, ga=72%, loss=0.4758, mrg=0.30, mst=0.310, ordered=1, stg=M] |
| E 73: train=91.4% val=86.1% geo=72% loss=0.4759/0.0805 cv=0.2502(f=0.00286 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.310 mrg=0.30 hn=0.996 hp=0.987 q=4096 (25s) |
| E 74/100: 100%|██████████| 195/195 [00:25<00:00, 7.75batch/s, acc=91.4%, cvf=0.0029, ga=72%, loss=0.4753, mrg=0.30, mst=0.310, ordered=1, stg=M] |
| E 74: train=91.5% val=86.7% geo=72% loss=0.4752/0.0797 cv=0.2535(f=0.00289 g=0.00038) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.310 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) β
|
| E 75/100: 100%|██████████| 195/195 [00:24<00:00, 7.96batch/s, acc=91.7%, cvf=0.0026, ga=72%, loss=0.4736, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 75: train=91.7% val=86.3% geo=72% loss=0.4736/0.0808 cv=0.2491(f=0.00267 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 76/100: 100%|██████████| 195/195 [00:24<00:00, 7.83batch/s, acc=91.8%, cvf=0.0026, ga=72%, loss=0.4726, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 76: train=91.8% val=86.3% geo=72% loss=0.4727/0.0810 cv=0.2525(f=0.00262 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 77/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=91.9%, cvf=0.0028, ga=72%, loss=0.4722, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 77: train=91.9% val=86.9% geo=72% loss=0.4721/0.0800 cv=0.2717(f=0.00285 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) β
|
| E 78/100: 100%|██████████| 195/195 [00:24<00:00, 7.86batch/s, acc=92.3%, cvf=0.0027, ga=72%, loss=0.4705, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 78: train=92.3% val=86.4% geo=72% loss=0.4706/0.0813 cv=0.2480(f=0.00264 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 79/100: 100%|██████████| 195/195 [00:24<00:00, 8.02batch/s, acc=92.6%, cvf=0.0025, ga=72%, loss=0.4693, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 79: train=92.6% val=86.8% geo=72% loss=0.4694/0.0799 cv=0.2268(f=0.00258 g=0.00046) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 80/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=92.6%, cvf=0.0028, ga=72%, loss=0.4692, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 80: train=92.6% val=86.5% geo=72% loss=0.4692/0.0806 cv=0.2503(f=0.00283 g=0.00039) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 81/100: 100%|██████████| 195/195 [00:25<00:00, 7.71batch/s, acc=93.0%, cvf=0.0028, ga=73%, loss=0.4678, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 81: train=93.0% val=86.4% geo=73% loss=0.4678/0.0801 cv=0.2661(f=0.00277 g=0.00036) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 82/100: 100%|██████████| 195/195 [00:25<00:00, 7.80batch/s, acc=93.0%, cvf=0.0024, ga=73%, loss=0.4676, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 82: train=93.0% val=86.8% geo=73% loss=0.4675/0.0793 cv=0.2440(f=0.00248 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 83/100: 100%|██████████| 195/195 [00:24<00:00, 8.10batch/s, acc=93.1%, cvf=0.0025, ga=73%, loss=0.4673, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 83: train=93.1% val=86.7% geo=73% loss=0.4673/0.0797 cv=0.2585(f=0.00246 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 84/100: 100%|██████████| 195/195 [00:24<00:00, 8.11batch/s, acc=93.2%, cvf=0.0024, ga=73%, loss=0.4671, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 84: train=93.2% val=86.6% geo=73% loss=0.4670/0.0798 cv=0.2067(f=0.00247 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 85/100: 100%|██████████| 195/195 [00:23<00:00, 8.15batch/s, acc=93.5%, cvf=0.0024, ga=73%, loss=0.4660, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 85: train=93.5% val=86.8% geo=73% loss=0.4660/0.0790 cv=0.2276(f=0.00236 g=0.00042) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 86/100: 100%|██████████| 195/195 [00:24<00:00, 7.93batch/s, acc=93.7%, cvf=0.0028, ga=73%, loss=0.4651, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 86: train=93.7% val=86.7% geo=73% loss=0.4650/0.0795 cv=0.2354(f=0.00279 g=0.00039) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 87/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=93.8%, cvf=0.0029, ga=73%, loss=0.4648, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 87: train=93.8% val=86.9% geo=73% loss=0.4648/0.0797 cv=0.2437(f=0.00287 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) β
|
| E 88/100: 100%|██████████| 195/195 [00:23<00:00, 8.14batch/s, acc=93.9%, cvf=0.0023, ga=73%, loss=0.4644, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 88: train=93.9% val=87.0% geo=73% loss=0.4643/0.0797 cv=0.2342(f=0.00225 g=0.00044) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) β
|
| E 89/100: 100%|██████████| 195/195 [00:24<00:00, 8.02batch/s, acc=93.9%, cvf=0.0024, ga=74%, loss=0.4640, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 89: train=93.9% val=87.0% geo=74% loss=0.4639/0.0793 cv=0.2337(f=0.00239 g=0.00035) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 90/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=93.9%, cvf=0.0026, ga=73%, loss=0.4641, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 90: train=93.9% val=87.2% geo=73% loss=0.4640/0.0788 cv=0.2491(f=0.00261 g=0.00046) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) β
|
| E 91/100: 100%|██████████| 195/195 [00:25<00:00, 7.75batch/s, acc=94.2%, cvf=0.0026, ga=73%, loss=0.4633, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 91: train=94.2% val=87.0% geo=73% loss=0.4633/0.0787 cv=0.2172(f=0.00254 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 92/100: 100%|██████████| 195/195 [00:24<00:00, 8.11batch/s, acc=94.3%, cvf=0.0025, ga=73%, loss=0.4629, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 92: train=94.3% val=86.8% geo=73% loss=0.4629/0.0798 cv=0.2159(f=0.00251 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (24s) |
| E 93/100: 100%|██████████| 195/195 [00:25<00:00, 7.75batch/s, acc=94.3%, cvf=0.0025, ga=73%, loss=0.4628, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 93: train=94.3% val=86.9% geo=73% loss=0.4628/0.0790 cv=0.2514(f=0.00248 g=0.00040) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 94/100: 100%|██████████| 195/195 [00:24<00:00, 7.92batch/s, acc=94.3%, cvf=0.0027, ga=73%, loss=0.4627, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 94: train=94.3% val=87.0% geo=73% loss=0.4628/0.0785 cv=0.2340(f=0.00266 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 95/100: 100%|██████████| 195/195 [00:25<00:00, 7.77batch/s, acc=94.4%, cvf=0.0028, ga=74%, loss=0.4622, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 95: train=94.4% val=87.1% geo=74% loss=0.4621/0.0791 cv=0.2106(f=0.00275 g=0.00037) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 96/100: 100%|██████████| 195/195 [00:25<00:00, 7.76batch/s, acc=94.5%, cvf=0.0023, ga=74%, loss=0.4624, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 96: train=94.5% val=87.2% geo=74% loss=0.4624/0.0788 cv=0.2395(f=0.00236 g=0.00041) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) β
|
| E 97/100: 100%|██████████| 195/195 [00:25<00:00, 7.72batch/s, acc=94.5%, cvf=0.0024, ga=74%, loss=0.4622, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 97: train=94.5% val=87.0% geo=74% loss=0.4622/0.0791 cv=0.2422(f=0.00238 g=0.00031) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 98/100: 100%|██████████| 195/195 [00:25<00:00, 7.66batch/s, acc=94.5%, cvf=0.0025, ga=73%, loss=0.4620, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 98: train=94.5% val=87.1% geo=73% loss=0.4621/0.0789 cv=0.2197(f=0.00252 g=0.00045) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E 99/100: 100%|██████████| 195/195 [00:25<00:00, 7.74batch/s, acc=94.5%, cvf=0.0027, ga=74%, loss=0.4622, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E 99: train=94.4% val=87.0% geo=74% loss=0.4623/0.0788 cv=0.2077(f=0.00275 g=0.00050) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
| E100/100: 100%|██████████| 195/195 [00:24<00:00, 7.93batch/s, acc=94.4%, cvf=0.0025, ga=74%, loss=0.4623, mrg=0.30, mst=0.309, ordered=1, stg=M] |
| E100: train=94.4% val=87.2% geo=74% loss=0.4622/0.0785 cv=0.2294(f=0.00245 g=0.00043) gd=0.0001 cm=100% anch=1/64 [MASTERY] mst=0.309 mrg=0.30 hn=0.996 hp=0.988 q=4096 (25s) |
|
|
| Best val accuracy: 87.2% |
|
|
| ============================================================ |
| DONE |
| ============================================================ |