checkpoints

This model was fine-tuned on the songlab/gpn-msa-sapiens-dataset dataset; the base model is not specified. It achieves the following results on the evaluation set:

Loss: 0.1624

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
distributed_type: multi-GPU
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 100
training_steps: 10000
mixed_precision_training: Native AMP
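
The scheduler settings above (cosine, 100 warmup steps, 10000 training steps, peak learning rate 0.0001) imply a learning-rate curve that can be sketched in plain Python. This is a minimal sketch, assuming a linear warmup followed by a single half-cosine decay to zero, which matches the defaults of `get_cosine_schedule_with_warmup` in Transformers; the card does not state the number of cycles or a minimum learning rate, so those are assumptions. The function name `lr_at` is hypothetical.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 1e-4
WARMUP_STEPS = 100
TRAINING_STEPS = 10_000


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay.

    Assumes decay to zero over a single half cycle; the card does not confirm this.
    """
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 50, 100, 5050, 10_000):
        print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate peaks at step 100, falls to half the peak at step 5050 (the midpoint of the decay phase), and reaches zero at step 10000.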

Training results

Training Loss	Epoch	Step	Validation Loss
0.6255	0.0232	50	0.2028
0.1532	0.0464	100	0.1865
0.1448	0.0695	150	0.1812
0.1431	0.0927	200	0.1786
0.1423	0.1159	250	0.1769
0.1413	0.1391	300	0.1753
0.1391	0.1623	350	0.1754
0.1398	0.1854	400	0.1761
0.1398	0.2086	450	0.1732
0.1384	0.2318	500	0.1733
0.1386	0.2550	550	0.1720
0.1377	0.2782	600	0.1741
0.1372	0.3013	650	0.1739
0.1374	0.3245	700	0.1725
0.1378	0.3477	750	0.1731
0.1371	0.3709	800	0.1706
0.1364	0.3941	850	0.1731
0.1382	0.4172	900	0.1730
0.1369	0.4404	950	0.1730
0.1373	0.4636	1000	0.1722
0.1357	0.4868	1050	0.1730
0.135	0.5100	1100	0.1715
0.1357	0.5331	1150	0.1702
0.1366	0.5563	1200	0.1708
0.1352	0.5795	1250	0.1722
0.1366	0.6027	1300	0.1698
0.1353	0.6259	1350	0.1702
0.1363	0.6490	1400	0.1706
0.1362	0.6722	1450	0.1689
0.1345	0.6954	1500	0.1679
0.1355	0.7186	1550	0.1690
0.135	0.7418	1600	0.1681
0.1348	0.7650	1650	0.1672
0.1343	0.7881	1700	0.1673
0.1342	0.8113	1750	0.1693
0.1334	0.8345	1800	0.1665
0.136	0.8577	1850	0.1671
0.1349	0.8809	1900	0.1690
0.1345	0.9040	1950	0.1672
0.1332	0.9272	2000	0.1669
0.1349	0.9504	2050	0.1685
0.1355	0.9736	2100	0.1678
0.134	0.9968	2150	0.1670
0.1345	1.0199	2200	0.1672
0.1345	1.0431	2250	0.1681
0.1339	1.0663	2300	0.1662
0.1333	1.0895	2350	0.1674
0.1336	1.1127	2400	0.1651
0.1335	1.1358	2450	0.1657
0.1335	1.1590	2500	0.1671
0.1322	1.1822	2550	0.1655
0.1333	1.2054	2600	0.1664
0.1325	1.2286	2650	0.1660
0.1334	1.2517	2700	0.1654
0.1326	1.2749	2750	0.1657
0.1323	1.2981	2800	0.1658
0.1324	1.3213	2850	0.1662
0.1327	1.3445	2900	0.1664
0.1324	1.3676	2950	0.1670
0.1325	1.3908	3000	0.1659
0.1318	1.4140	3050	0.1639
0.1328	1.4372	3100	0.1665
0.1332	1.4604	3150	0.1676
0.1326	1.4835	3200	0.1649
0.1327	1.5067	3250	0.1664
0.1332	1.5299	3300	0.1658
0.1321	1.5531	3350	0.1662
0.1324	1.5763	3400	0.1642
0.1317	1.5994	3450	0.1658
0.1326	1.6226	3500	0.1651
0.1328	1.6458	3550	0.1651
0.1327	1.6690	3600	0.1654
0.1314	1.6922	3650	0.1639
0.1319	1.7153	3700	0.1658
0.1308	1.7385	3750	0.1657
0.1322	1.7617	3800	0.1649
0.1315	1.7849	3850	0.1649
0.1316	1.8081	3900	0.1639
0.1312	1.8312	3950	0.1659
0.1323	1.8544	4000	0.1648
0.132	1.8776	4050	0.1641
0.1308	1.9008	4100	0.1647
0.1323	1.9240	4150	0.1644
0.131	1.9471	4200	0.1631
0.1313	1.9703	4250	0.1632
0.1309	1.9935	4300	0.1644
0.1319	2.0167	4350	0.1643
0.1316	2.0399	4400	0.1641
0.1312	2.0631	4450	0.1633
0.1313	2.0862	4500	0.1646
0.1316	2.1094	4550	0.1629
0.1306	2.1326	4600	0.1644
0.1313	2.1558	4650	0.1636
0.131	2.1790	4700	0.1635
0.1311	2.2021	4750	0.1637
0.1315	2.2253	4800	0.1641
0.1314	2.2485	4850	0.1641
0.1306	2.2717	4900	0.1636
0.1314	2.2949	4950	0.1635
0.1321	2.3180	5000	0.1628
0.13	2.3412	5050	0.1635
0.1315	2.3644	5100	0.1641
0.1311	2.3876	5150	0.1638
0.1304	2.4108	5200	0.1647
0.131	2.4339	5250	0.1624
0.131	2.4571	5300	0.1633
0.1308	2.4803	5350	0.1639
0.1314	2.5035	5400	0.1643
0.1307	2.5267	5450	0.1631
0.1314	2.5498	5500	0.1635
0.1303	2.5730	5550	0.1635
0.1304	2.5962	5600	0.1631
0.1298	2.6194	5650	0.1623
0.1305	2.6426	5700	0.1627
0.1308	2.6657	5750	0.1624
0.1306	2.6889	5800	0.1639
0.1303	2.7121	5850	0.1633
0.131	2.7353	5900	0.1625
0.1305	2.7585	5950	0.1637
0.1307	2.7816	6000	0.1623
0.1296	2.8048	6050	0.1633
0.1312	2.8280	6100	0.1632
0.1305	2.8512	6150	0.1640
0.1303	2.8744	6200	0.1632
0.1302	2.8975	6250	0.1621
0.1298	2.9207	6300	0.1624
0.1313	2.9439	6350	0.1632
0.1304	2.9671	6400	0.1624
0.13	2.9903	6450	0.1637
0.1313	3.0134	6500	0.1630
0.131	3.0366	6550	0.1612
0.1311	3.0598	6600	0.1621
0.1301	3.0830	6650	0.1619
0.1298	3.1062	6700	0.1622
0.1302	3.1293	6750	0.1609
0.1311	3.1525	6800	0.1623
0.1304	3.1757	6850	0.1625
0.1306	3.1989	6900	0.1615
0.1302	3.2221	6950	0.1622
0.1307	3.2452	7000	0.1623
0.1302	3.2684	7050	0.1626
0.1296	3.2916	7100	0.1622
0.1321	3.3148	7150	0.1622
0.1307	3.3380	7200	0.1628
0.1301	3.3611	7250	0.1625
0.1307	3.3843	7300	0.1632
0.1305	3.4075	7350	0.1627
0.1303	3.4307	7400	0.1626
0.13	3.4539	7450	0.1625
0.1288	3.4771	7500	0.1633
0.1301	3.5002	7550	0.1618
0.1304	3.5234	7600	0.1620
0.1303	3.5466	7650	0.1621
0.1308	3.5698	7700	0.1630
0.1297	3.5930	7750	0.1621
0.13	3.6161	7800	0.1618
0.1305	3.6393	7850	0.1628
0.1314	3.6625	7900	0.1624
0.1299	3.6857	7950	0.1616
0.1294	3.7089	8000	0.1610
0.1293	3.7320	8050	0.1618
0.1303	3.7552	8100	0.1616
0.1318	3.7784	8150	0.1621
0.1295	3.8016	8200	0.1613
0.1309	3.8248	8250	0.1620
0.1288	3.8479	8300	0.1615
0.1296	3.8711	8350	0.1623
0.1302	3.8943	8400	0.1621
0.1303	3.9175	8450	0.1625
0.1296	3.9407	8500	0.1624
0.1302	3.9638	8550	0.1615
0.1311	3.9870	8600	0.1618
0.13	4.0102	8650	0.1617
0.1299	4.0334	8700	0.1623
0.1302	4.0566	8750	0.1627
0.1302	4.0797	8800	0.1612
0.1308	4.1029	8850	0.1627
0.1298	4.1261	8900	0.1633
0.1297	4.1493	8950	0.1626
0.1298	4.1725	9000	0.1617
0.1304	4.1956	9050	0.1623
0.1302	4.2188	9100	0.1616
0.1298	4.2420	9150	0.1626
0.1294	4.2652	9200	0.1625
0.1308	4.2884	9250	0.1621
0.1302	4.3115	9300	0.1624
0.1311	4.3347	9350	0.1613
0.1304	4.3579	9400	0.1629
0.13	4.3811	9450	0.1635
0.1295	4.4043	9500	0.1616
0.1305	4.4274	9550	0.1617
0.13	4.4506	9600	0.1630
0.1297	4.4738	9650	0.1613
0.13	4.4970	9700	0.1622
0.1302	4.5202	9750	0.1617
0.1305	4.5433	9800	0.1622
0.1299	4.5665	9850	0.1631
0.1299	4.5897	9900	0.1618
0.1307	4.6129	9950	0.1629
0.1293	4.6361	10000	0.1616
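
The table above can be read programmatically, for example to find the checkpoint with the lowest validation loss or to estimate how many optimizer steps make up one epoch. The sketch below is illustrative only: it embeds a handful of rows copied from the table rather than the full log, and the names `ROWS` and `parse_rows` are hypothetical.

```python
# A few rows copied from the table above, tab-separated:
# Training Loss, Epoch, Step, Validation Loss.
ROWS = """\
0.6255\t0.0232\t50\t0.2028
0.1373\t0.4636\t1000\t0.1722
0.1309\t1.9935\t4300\t0.1644
0.1302\t3.1293\t6750\t0.1609
0.1294\t3.7089\t8000\t0.1610
0.1293\t4.6361\t10000\t0.1616
"""


def parse_rows(text: str) -> list[dict]:
    """Turn tab-separated log rows into a list of typed records."""
    records = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split("\t")
        records.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return records


records = parse_rows(ROWS)

# Row with the lowest validation loss among the excerpted rows.
best = min(records, key=lambda r: r["val_loss"])

# Steps per epoch, estimated from the final logged row.
last = records[-1]
steps_per_epoch = last["step"] / last["epoch"]

if __name__ == "__main__":
    print(f"best step: {best['step']} (validation loss {best['val_loss']})")
    print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
```

Running the same parsing over every row gives the complete picture; the excerpt only demonstrates the approach.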

Framework versions

Transformers 4.40.2
Pytorch 2.8.0+cu126
Datasets 4.0.0
Tokenizers 0.19.1
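
To check a local environment against these versions, a small standard-library script may help. This is a sketch under stated assumptions: the distribution names below are assumed (in particular, "Pytorch" is taken to mean the `torch` package), and `compare_versions` is a hypothetical helper, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card. The distribution names are assumptions;
# "Pytorch" is assumed to correspond to the "torch" package.
CARD_VERSIONS = {
    "transformers": "4.40.2",
    "torch": "2.8.0+cu126",
    "datasets": "4.0.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: dict[str, str]) -> dict[str, tuple]:
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # Package is not installed in this environment.
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions(CARD_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{package}: card={wanted} installed={installed} [{status}]")
```

An exact string match is deliberately strict; a different CUDA build suffix on `torch`, for instance, is reported as a difference even when the base version is the same.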

Dataset used to train pl593/gpn-msa-model-h1: songlab/gpn-msa-sapiens-dataset