---
license: apache-2.0
datasets:
- bigcode/the-stack-smol
- ttbui/html_alpaca
language:
- en
tags:
- code
- coding
- small
- tiny
---

# Welcome to htmLLM v2 124M!

With this LLM, we wanted to see how well a tiny LLM with just 124 million parameters can perform on coding tasks.

The model was also lightly fine-tuned on html_alpaca, mixed directly into the pretraining data.

If you want to try it, use htmllm.ipynb from the files of this HF model and download the model weights from the same repository, as sketched below.
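
A minimal sketch of the download step using the `huggingface_hub` library; the repo id below is a placeholder, substitute this model's actual id:

```python
# Hypothetical sketch: fetch the notebook and checkpoint via huggingface_hub.
# "your-username/htmllm-v2-124m" is a placeholder repo id, not the real one.
from huggingface_hub import hf_hub_download

notebook_path = hf_hub_download(repo_id="your-username/htmllm-v2-124m",
                                filename="htmllm.ipynb")
ckpt_path = hf_hub_download(repo_id="your-username/htmllm-v2-124m",
                            filename="ckpt.pt")
print(notebook_path, ckpt_path)
```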

# Code

All code can be accessed via the file **htmllm_v2_124m.ipynb** in this HF model repository.

# Weights

The final **base** model checkpoint will be available soon in this repository's files list as **ckpt.pt**.
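
Since the model builds on nanoGPT (see the thanks below), loading the checkpoint will likely look something like the sketch below. This assumes **ckpt.pt** follows nanoGPT's checkpoint layout (`{'model': state_dict, 'model_args': dict}`) and that the GPT-2 BPE tokenizer was used, as in nanoGPT's defaults; check the notebook for the exact procedure.

```python
# Sketch only: assumes a nanoGPT-style checkpoint, nanoGPT's model.py
# (GPT, GPTConfig) on the path, and the GPT-2 tokenizer via tiktoken.
import torch
import tiktoken
from model import GPT, GPTConfig  # from Karpathy's nanoGPT codebase

device = "cuda" if torch.cuda.is_available() else "cpu"
ckpt = torch.load("ckpt.pt", map_location=device)

model = GPT(GPTConfig(**ckpt["model_args"]))
# nanoGPT prefixes keys with '_orig_mod.' when trained under torch.compile
state_dict = {k.removeprefix("_orig_mod."): v for k, v in ckpt["model"].items()}
model.load_state_dict(state_dict)
model.to(device).eval()

# Generate a completion for a small HTML prompt
enc = tiktoken.get_encoding("gpt2")
idx = torch.tensor([enc.encode("<!DOCTYPE html>")], device=device)
with torch.no_grad():
    out = model.generate(idx, max_new_tokens=100, temperature=0.8, top_k=200)
print(enc.decode(out[0].tolist()))
```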

# Training

We trained our model on a single Kaggle T4 GPU.

# Thanks to:

- Andrej Karpathy and his nanoGPT code
- Kaggle for the free T4 GPU hours used for training
- You all for your support on Reddit