| license: mit | |
| This is a custom model with 123.5M parameters. | |
| - A modified version of GPT-2 | |
| - More data will be added soon | |
| - Pretraining was done on the FineWeb-Edu dataset | |
| - No fine-tuning has been done |