Ambuj Varshney committed on
Commit 0477032 · verified · 1 Parent(s): ed36543

Update README.md

Files changed (1): README.md +28 -1
README.md CHANGED
@@ -4,4 +4,31 @@ datasets:
  - HuggingFaceFW/fineweb
language:
  - en
- ---
+ ---
+
+ # TinyLLM
+
+ ## Overview
+
+ This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). These models are specifically designed and fine-tuned with sensor data to support embedded sensing applications, enabling locally hosted language models on low-compute devices such as single-board computers. The models are based on the GPT-2 architecture and were trained on NVIDIA H100 GPUs. This repository provides base models that can be further fine-tuned for specific downstream tasks in embedded sensing.
+
+ ## Model Information
+
+ - **Parameters:** 124M (hidden size = 768)
+ - **Architecture:** Decoder-only transformer
+ - **Training Data:** Up to 10B tokens from the [SHL](http://www.shl-dataset.org/) and [Fineweb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) datasets, combined in a 0:1 ratio
+ - **Input and Output Modality:** Text
+ - **Context Length:** 1024 tokens
+
+ ## Acknowledgements
+
+ We would like to acknowledge the open-source frameworks [llm.c](https://github.com/karpathy/llm.c) and [llama.cpp](https://github.com/ggerganov/llama.cpp), which were instrumental in training and testing these models.
+
+ ## Usage
+
+ The model can be used in two primary ways:
+ 1. **With Hugging Face's Transformers library**
+ 2. **With llama.cpp**
+
+ ## Disclaimer
+
+ This model is intended solely for research purposes.
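
The updated model card states 124M parameters with hidden size 768 and context length 1024. As a quick sanity check, the figure can be reproduced arithmetically. This sketch assumes the standard GPT-2 small hyperparameters (12 layers, vocabulary 50257, tied input/output embeddings), which the card does not state explicitly:

```python
# Arithmetic check of the "124M parameters, hidden size 768" figure,
# assuming the standard GPT-2 small configuration (an assumption:
# 12 layers, vocab 50257, context 1024, tied embeddings).
V, H, L, CTX = 50257, 768, 12, 1024

embeddings = V * H + CTX * H      # token + position embedding tables
per_layer = (
    2 * H                         # ln_1 (weight + bias)
    + (H * 3 * H + 3 * H)         # fused q/k/v projection
    + (H * H + H)                 # attention output projection
    + 2 * H                       # ln_2
    + (H * 4 * H + 4 * H)         # MLP up-projection
    + (4 * H * H + H)             # MLP down-projection
)
final_ln = 2 * H
total = embeddings + L * per_layer + final_ln  # lm_head shares the wte weights

print(f"{total:,} parameters")  # prints "124,439,808 parameters"
```

The total rounds to the 124M quoted in Model Information; if the TinyLLM base models deviate from the stock GPT-2 small layout, the per-layer terms above would need adjusting.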