---
license: apache-2.0
datasets:
- bigcode/the-stack-smol
language:
- en
tags:
- code
- base
- coding
- small
- tiny
---

# Welcome to htmLLM 50M Base!
With this LLM, we wanted to see how well a tiny model with just 50 million parameters can perform on coding tasks.
If you want to try it, open **htmllm.ipynb** from the model files and download the model weights from this HF repository.
# Code
All code can be accessed via the file **htmllm.ipynb** in this HF model.
# Weights
The final **base** model checkpoint can be downloaded from this repository's files list as **ckpt.pt**.
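If you want to load the checkpoint outside the notebook, here is a minimal PyTorch sketch. It assumes the nanoGPT checkpoint convention of a dict containing a `model` state dict (possibly with `_orig_mod.` prefixes from `torch.compile`) and a `model_args` entry; the exact keys inside `ckpt.pt` may differ, and `load_checkpoint` is a hypothetical helper, not part of the released code.

```python
import torch


def load_checkpoint(path: str, device: str = "cpu") -> dict:
    """Load a nanoGPT-style checkpoint dict from `path` (assumption: keys
    'model' and 'model_args', as in Karpathy's nanoGPT training script)."""
    ckpt = torch.load(path, map_location=device)
    # nanoGPT saves torch.compile'd models with an '_orig_mod.' key prefix;
    # strip it so the weights load into an uncompiled model.
    prefix = "_orig_mod."
    ckpt["model"] = {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in ckpt["model"].items()
    }
    return ckpt


# Usage (after downloading ckpt.pt from the files list):
# ckpt = load_checkpoint("ckpt.pt")
# model_args = ckpt.get("model_args")  # hyperparameters, if present
```

From there you would construct the model with `model_args` and call `model.load_state_dict(ckpt["model"])`, as in the notebook.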
# Training
We trained our model on a single Kaggle T4 GPU.
# Thanks to:
- Andrej Karpathy for his nanoGPT code
- Kaggle for the free T4 GPU hours used for training
- All of you for your support on Reddit