---
license: apache-2.0
datasets:
- bigcode/the-stack-smol
- ttbui/html_alpaca
language:
- en
tags:
- code
- coding
- small
- tiny
---
# Welcome to htmLLM v2 124M!
With this LLM, we wanted to see how well a tiny model with just 124 million parameters can perform on coding tasks.
The model is also lightly finetuned on html_alpaca, mixed directly into the pretraining data.
If you want to try it, use htmllm.ipynb from the files of this HF model and download the model weights from the same place.
# Code
All code can be accessed via the file **htmllm_v2_124m.ipynb** in this HF model.
# Weights
The final **base** model checkpoint can be downloaded here in the files list as **ckpt.pt**.
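Since the model builds on nanoGPT, the checkpoint may have been saved from a `torch.compile`-wrapped model, in which case the state-dict keys carry a `_orig_mod.` prefix that must be stripped before loading into a plain GPT module. A minimal sketch of that cleanup, assuming **ckpt.pt** follows nanoGPT's checkpoint format (a dict with a `"model"` entry):

```python
def strip_compile_prefix(state_dict):
    """Remove the '_orig_mod.' prefix that torch.compile adds to
    state-dict keys, so the weights load into an uncompiled model."""
    prefix = "_orig_mod."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Typical use (requires torch and the downloaded ckpt.pt):
# import torch
# ckpt = torch.load("ckpt.pt", map_location="cpu")
# model.load_state_dict(strip_compile_prefix(ckpt["model"]))
```

The full loading and sampling code is in the notebook; this helper only covers the key-renaming step.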
# Training
We trained our model on a single Kaggle T4 GPU.
# Thanks to:
- Andrej Karpathy and his nanoGPT code
- Kaggle for the free GPU hours for training on the T4
- All of you for your support on Reddit.