ASomeoneWhoInterestedWithAI
/

LookThem_Tiny-ImageNet

Commit

b3aaca2

1 Parent(s): 4c5f6f6

Update README.md

Files changed (1)

README.md CHANGED

@@ -13,7 +13,7 @@ I was curious: what if a token looks at other tokens, but without QKV? Instead,
 Then I tried it on MNIST (a LookThem architecture with one layer), and in just a few epochs I got astonishing results. Because of that, I went deeper and tried CIFAR-10 with a similar architecture, and the results were good too. So I went on to Tiny-ImageNet. The result is... I don't know; the results shown in the notebook are not evaluation results (the AI changed the code). Maybe not 100% accuracy, but at least it can learn, while taking very little disk space (just ~5MB). Those are the results for you all.
-There is a lot of room for experimenting, such as a deeper architecture, another activation function, etc. But even without much tuning of the training parameters, it reaches SOTA (from scratch category).. correct me if I'm wrong about SOTA. So, everyone who has bigger resources can experiment with this architecture. I trained it on Google Colab's T4, and the code was generated by Gemini 3 Flash (except for the original code).
+There is a lot of room for experimenting, such as a deeper architecture, another activation function, etc. But even without much tuning of the training parameters, it reaches SOTA (for its size).. correct me if I'm wrong about SOTA. So, everyone who has bigger resources can experiment with this architecture. I trained it on Google Colab's T4, and the code was generated by Gemini 3 Flash (except for the original code).
 # Code