---
license: apache-2.0
datasets:
- bigcode/the-stack-smol
- ttbui/html_alpaca
language:
- en
tags:
- code
- coding
- small
- tiny
---
# Welcome to htmLLM v2 124M!
With this LLM, we wanted to see how well a tiny model with just 124 million parameters can perform on coding tasks.
The model is also lightly finetuned on html_alpaca, mixed directly into the pretraining data.
If you want to try it, use htmllm.ipynb from the files of this HF model and download the model weights from the same place.
# Code
All code can be accessed via the file **htmllm_v2_124m.ipynb** in this HF model.
# Weights
The final **base** model checkpoint can be downloaded here in the files list as **ckpt.pt**.
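Since the model builds on nanoGPT, the checkpoint may have been saved from a `torch.compile`-wrapped model, in which case the state-dict keys carry a `_orig_mod.` prefix that must be stripped before loading into a plain GPT module. A minimal sketch of that cleanup, assuming **ckpt.pt** follows nanoGPT's checkpoint format (a dict with a `"model"` entry):

```python
def strip_compile_prefix(state_dict):
    """Remove the '_orig_mod.' prefix that torch.compile adds to
    state-dict keys, so the weights load into an uncompiled model."""
    prefix = "_orig_mod."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Typical use (requires torch and the downloaded ckpt.pt):
# import torch
# ckpt = torch.load("ckpt.pt", map_location="cpu")
# model.load_state_dict(strip_compile_prefix(ckpt["model"]))
```

The full loading and sampling code is in the notebook; this helper only covers the key-renaming step.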
# Training
We trained our model on a single Kaggle T4 GPU.
# Thanks to:
- Andrej Karpathy and his nanoGPT code
- Kaggle for the free GPU hours for training on the T4
- All of you for your support on Reddit.