Ashx098 committed on
Commit 87879d6 · verified · 1 Parent(s): f155ce7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +23 -1
README.md CHANGED
@@ -1,3 +1,25 @@
  # 🧠 Mini-LLM — 80M Parameter Transformer (Pretrained From Scratch)
 
  <p align="center">
@@ -85,7 +107,7 @@ print(tok.decode(outputs[0], skip_special_tokens=True))
  - Trained on 1× NVIDIA A100 80GB
 
  ## 📊 Training Curve
- <p align="center"> <img src="phase-1-pretraining/plots/loss_curve.png" width="500"> </p>
 
  Final loss reached: ~3.25
 
 
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - llm
+ - decoder-only
+ - transformer
+ - from-scratch
+ - research
+ - educational
+ - 80m
+ - pytorch
+ - pretraining
+ - custom-architecture
+ pipeline_tag: text-generation
+ inference:
+   parameters:
+     temperature: 0.7
+     top_p: 0.95
+ ---
+
  # 🧠 Mini-LLM — 80M Parameter Transformer (Pretrained From Scratch)
 
  <p align="center">
 
  - Trained on 1× NVIDIA A100 80GB
 
  ## 📊 Training Curve
+ <p align="center"> <img src="https://huggingface.co/Ashx098/Mini-LLM/resolve/main/phase-1-pretraining/plots/loss_curve.png" width="500"> </p>
 
  Final loss reached: ~3.25
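
The second hunk swaps a repo-relative image path for an absolute URL. As a rough illustration of how such a URL can be assembled, here is a minimal sketch that assumes the `https://huggingface.co/{repo_id}/resolve/{revision}/{path}` pattern visible in the diff. `resolve_url` and `HF_ENDPOINT` are hypothetical names introduced for this example; they are not taken from the commit.

```python
from urllib.parse import quote

# Assumption: base endpoint as it appears in the URL added by this commit.
HF_ENDPOINT = "https://huggingface.co"


def resolve_url(repo_id: str, path_in_repo: str, revision: str = "main") -> str:
    """Build an absolute URL to a file in a Hub repo (hypothetical helper)."""
    # Percent-encode each component, keeping "/" so nested paths survive.
    return (
        f"{HF_ENDPOINT}/{quote(repo_id, safe='/')}"
        f"/resolve/{quote(revision, safe='')}/{quote(path_in_repo, safe='/')}"
    )


url = resolve_url("Ashx098/Mini-LLM", "phase-1-pretraining/plots/loss_curve.png")
print(url)
```

With the repo ID and path from the diff, this reproduces the `src` value on the added line. An absolute URL keeps the image reachable wherever the README is rendered, while a relative path depends on the viewer resolving it against the repository root.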