NamrataThakur/Small_Language_Model_MOE_127M_Pretrained Text Generation • Updated 24 days ago • 2.64k • 1
NamrataThakur/Small_Language_Model_GQA_48M_Pretrained Text Generation • Updated 24 days ago • 2.64k • 1
NamrataThakur/Small_Language_Model_MHA_53M_Pretrained Text Generation • Updated 24 days ago • 2.64k • 1
NamrataThakur/llama31-8bn_Reinforcement-Fine-Tuned Question Answering • 8B • Updated 27 days ago • 189
Stories-SLM Collection: A collection of Small Language Models pretrained from scratch (using only PyTorch) on the Tiny Stories dataset, on a single Tesla T4 16GB GPU. • 3 items • Updated Mar 8 • 1
NamrataThakur/llama32-1bn_FederatedLearning_Fine-Tuned_nonQuantized Question Answering • 1B • Updated Feb 24 • 3