Apply for community grant: Personal project (gpu)
I built and pretrained my own LLM entirely from scratch. This isn't a fine-tune or retraining of an open-source model: while I studied open-source models for general guidance, the architecture and training code are my own independent design.
The model has 269 million parameters and was pretrained on 10 billion tokens of educational content. At this stage it is a base (pretrained-only) model, so it is best suited to sentence completion, a task it performs well. For instance:
Prompt: "HTML stands for"
Response: "HyperText Markup Language. For more information about the browser, see:<|endoftext|>A few months ago."
All the code and results are available here: https://github.com/LF-Luis/MyLLM/tree/main
Model hosted on Hugging Face: https://huggingface.co/LF-Luis/LF_LLM-269M/tree/main
The model achieved 35% accuracy on the HellaSwag benchmark, a promising result for a model of this scale. It also produces longer, fairly coherent document completions; examples can be found in this notebook: https://github.com/LF-Luis/MyLLM/blob/main/notebooks/hf_sampling.ipynb.
This project shows that breaking a complex problem into manageable pieces can lead to substantial results, and I hope it serves as inspiration for others who want to build in deep learning from the ground up.
I plan to build a small Gradio app for this LLM in HF Spaces, so I would really appreciate GPU access to make it run faster for people who want to try out the model's capabilities. Thanks!