Update model card with final TensorBoard training metrics
README.md
CHANGED

@@ -3,18 +3,18 @@ library_name: transformers
 license: apache-2.0
 base_model: LiquidAI/LFM2.5-1.2B-Instruct
 tags:
-- unsloth
-- lfm
-- sft
-- fine-tuned
-- liquid
-- lora
-- trl
-- hf_jobs
+- unsloth
+- lfm
+- sft
+- fine-tuned
+- liquid
+- lora
+- trl
+- hf_jobs
 datasets:
-- mlabonne/FineTome-100k
+- mlabonne/FineTome-100k
 language:
-- en
+- en
 pipeline_tag: text-generation
 ---
 
@@ -45,13 +45,21 @@ Trained using [Unsloth](https://github.com/unslothai/unsloth) SFT on [Hugging Fa
 | **Rank (r)** | 16 |
 | **Trainable parameters** | 11,108,352 / 1,181,448,960 (0.94%) |
 
-### Training Metrics
+### Training Metrics (from TensorBoard)
 
 | Step | Loss | Grad Norm | Learning Rate | Epoch |
 |---|---|---|---|---|
-
-| 2,
-
+| 1,000 | 0.6984 | 0.349 | 1.80e-4 | 0.10 |
+| 2,000 | 0.6898 | 0.298 | 1.60e-4 | 0.20 |
+| 3,000 | 0.6696 | 0.266 | 1.40e-4 | 0.30 |
+| 4,000 | 0.6694 | 0.523 | 1.20e-4 | 0.40 |
+| 5,000 | 0.6697 | 0.356 | 1.00e-4 | 0.50 |
+| 6,000 | 0.6766 | 0.367 | 8.00e-5 | 0.60 |
+| 7,000 | 0.6574 | 0.426 | 6.00e-5 | 0.70 |
+| 8,000 | 0.6562 | 0.387 | 4.00e-5 | 0.80 |
+| 9,000 | 0.6673 | 0.516 | 2.00e-5 | 0.90 |
+
+**Final training loss: 0.6562** (at step 8,000). Loss decreased from 0.6984 to 0.6562 over the course of training (~6% reduction).
 
 ## Usage
 
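The learning-rate column in the added metrics table is consistent with a linear decay from a 2.0e-4 peak to zero over 10,000 total steps. Here is a minimal sanity-check sketch; the peak LR, the total step count, and the `linear_lr` helper are all inferred from the table values, not stated anywhere in the card:

```python
# Sketch (assumption): the LR column matches a linear decay from a 2.0e-4
# peak to 0 over 10,000 optimizer steps, with no warmup at the logged steps.
# peak_lr and total_steps are inferred from the table, not from the card.

def linear_lr(step, peak_lr=2.0e-4, total_steps=10_000):
    """Linearly decayed learning rate at a given optimizer step."""
    return peak_lr * (1 - step / total_steps)

# Reproduce the table's learning-rate column at each logged step.
for step in range(1_000, 10_000, 1_000):
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Each logged step reproduces the corresponding table entry, e.g. step 1,000 gives 1.80e-4 and step 9,000 gives 2.00e-5.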