---
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
pipeline_tag: text-generation
tags:
- base
- small
- cpu
- open-source
- open
- spark
- lh-tech
- llm
- llama
- tiny
---

# ✨ Spark v4
Today, we are introducing Spark v4, a 5M-parameter Llama base model trained on 0.7B tokens from the `sample-10BT` subset of FineWeb-Edu.
|
|
| ## Results |
| - Final Loss / Val Loss: ~3.1 / 3.108 |
| - Output quality: 5/10 |
| - PIQA: 0.5593 |
| - LAMBADA (PPL): 588.26 |
| - HellaSwag: 0.2695 |
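
To put the validation loss in context, it can be converted to a perplexity. This is a small sketch that assumes the reported loss is the mean per-token cross-entropy in nats (the usual convention, but not confirmed here). The LAMBADA perplexity above is measured on a different task and text, so the two numbers are not expected to match.

```python
import math

# Validation loss reported in the results above.
val_loss = 3.108

# Assumption: the loss is mean cross-entropy per token, measured in nats,
# so perplexity is simply exp(loss).
val_perplexity = math.exp(val_loss)

print(f"validation perplexity ~ {val_perplexity:.1f}")
```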
|
|
| More information about the Spark Sub-5M research: https://lh-tech.de/ai/sub-5m-research.html |
|
|
| ## Usage |
You can run the model with `use.py`. First, download `config.json`, `tokenizer.json`, `tokenizer_config.json`, `model.safetensors` and `generation_config.json`, and place them in a subfolder named `spark_v4_fp16_final`.
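
If you would rather load the model directly, the sketch below shows one way to do it with the `transformers` library. It is not the contents of `use.py`; it assumes the downloaded files load through the standard `AutoTokenizer` and `AutoModelForCausalLM` classes and that `torch` is installed. The helper names `missing_files` and `generate`, and the `max_new_tokens=100` default, are made up for this example.

```python
from pathlib import Path

# Folder name and file list taken from the instructions above.
MODEL_DIR = Path("spark_v4_fp16_final")
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "model.safetensors",
    "generation_config.json",
]


def missing_files(model_dir=MODEL_DIR):
    """Return the required files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


def generate(prompt, model_dir=MODEL_DIR, max_new_tokens=100):
    """Hypothetical helper: load the local model and continue the prompt."""
    # Imported lazily so the file check works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(str(model_dir))
    model = AutoModelForCausalLM.from_pretrained(str(model_dir))

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print(generate("The main concept of physics"))
```

Checking for the files first gives a clearer message than a loading error if one of the downloads was skipped.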
|
|
| ## Example output |
| **Input:** The main concept of physics<br> |
| **Output:** *is that it gives some unlimited means to think about the universe. It helps us not only to think about how the universe is created but also how we think about the universe. In this way, an inner universe can be made to our own universe. This is because it is not a matter of fact and that is the object of the universe. It can take a lot of time to understand how it is created and why it must be made. |
| In the first place, the Universe is a complex and interesting part of it. It can be a kind of a real, creative, and universal part of our universe. It can be just that the universe was created. It could be a kind of universe. It can be a kind of kind of complex concept. That could be something that does something that really needs to be a kind of universe, or something that* |
|
|
| ## Training code |
| The full training code can be found as `train.ipynb` in this repo. |
|
|
| ## Training specs |
| - GPU: 2x T4 on Kaggle |
| - Time: ~4 hours |
- More details are in the notebook :-)
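
For a rough sense of scale, the figures on this card can be combined into a tokens-per-parameter ratio and an average training throughput. This is back-of-the-envelope arithmetic only: it treats the rounded values (5M parameters, 0.7B tokens, ~4 hours) as exact and assumes the 0.7B tokens are the total processed during the run.

```python
# Rounded figures taken from this model card.
tokens = 0.7e9   # training tokens
params = 5e6     # model parameters
hours = 4        # approximate wall-clock training time
num_gpus = 2     # 2x T4

tokens_per_param = tokens / params
tokens_per_second = tokens / (hours * 3600)
tokens_per_second_per_gpu = tokens_per_second / num_gpus

print(f"tokens per parameter: {tokens_per_param:.0f}")
print(f"average throughput:   {tokens_per_second:,.0f} tokens/s")
print(f"per GPU:              {tokens_per_second_per_gpu:,.0f} tokens/s")
```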
|
|
| ## Have fun :D |