Update README.md
README.md
CHANGED
@@ -9,8 +9,32 @@ base_model:
 - LiquidAI/LFM2.5-1.2B-Instruct
 ---
 
-## Training
+## 📉 Training Results & Metrics
+
+This model was fine-tuned from **LiquidAI/LFM2.5-1.2B-Instruct** using **Unsloth** on a T4 GPU. The following metrics were recorded during the final training run.
+
+| Metric | Value | Description |
+| :--- | :--- | :--- |
+| **Final Loss** | `0.7431` | Training loss at the final step. |
+| **Average Train Loss** | `0.8274` | Mean training loss over the whole run. |
+| **Epochs** | `0.96` | Roughly one full pass over the dataset. |
+| **Global Steps** | `60` | Total number of optimizer updates. |
+| **Runtime** | `594 s` (~10 min) | Total wall-clock training time. |
+| **Samples/Second** | `0.808` | Throughput on the T4 GPU. |
+| **Gradient Norm** | `0.345` | Indicates stable training (no exploding gradients). |
+| **Learning Rate** | `3.64e-6` | Final learning rate after decay. |
+| **Total FLOs** | `2.07e15` | Total floating-point operations computed. |
+
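A quick arithmetic cross-check of the table above (a sketch: the samples-per-step figure is inferred from the logged numbers, not itself a logged value):

```python
# Cross-check the logged throughput metrics from the table above.
runtime_s = 594         # Runtime
samples_per_s = 0.808   # Samples/Second
global_steps = 60       # Global Steps

total_samples = samples_per_s * runtime_s          # samples processed overall
samples_per_step = total_samples / global_steps    # inferred effective batch size

print(round(total_samples), round(samples_per_step))  # -> 480 8
```

So the run saw roughly 480 samples, about 8 per optimizer update.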
+### 🛠️ Hardware & Framework
+* **Hardware:** NVIDIA Tesla T4 (Google Colab Free Tier)
+* **Framework:** Unsloth (PyTorch)
+* **Quantization:** 4-bit (QLoRA)
+* **Optimizer:** AdamW 8-bit
+
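The bullets above name the recipe but not the code. A minimal sketch of a matching Unsloth run follows — only the base model, 4-bit loading, AdamW 8-bit, and the 60-step budget come from this card; the LoRA rank/alpha, target modules, sequence length, batch size, and learning rate are assumptions:

```python
# Illustrative QLoRA fine-tune with Unsloth on a T4. Only the base model,
# 4-bit loading, adamw_8bit, and the 60-step budget come from this card;
# every other hyperparameter below is a placeholder assumption.

SETTINGS = {
    "base_model": "LiquidAI/LFM2.5-1.2B-Instruct",  # from the card metadata
    "load_in_4bit": True,                           # QLoRA, from the card
    "optim": "adamw_8bit",                          # from the card
    "max_steps": 60,                                # matches the logged global steps
    "lora_r": 16,                                   # assumed LoRA rank
}

def train(dataset):
    # Imported lazily so the sketch can be read and sanity-checked without
    # a GPU environment; running it requires unsloth and trl installed.
    from unsloth import FastLanguageModel
    from trl import SFTConfig, SFTTrainer

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=SETTINGS["base_model"],
        max_seq_length=2048,                   # assumed context length
        load_in_4bit=SETTINGS["load_in_4bit"],
    )
    # Attach LoRA adapters; rank, alpha, and target modules are assumptions.
    model = FastLanguageModel.get_peft_model(
        model,
        r=SETTINGS["lora_r"],
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )
    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        args=SFTConfig(
            per_device_train_batch_size=2,     # assumed; with accumulation,
            gradient_accumulation_steps=4,     # an effective batch of 8
            max_steps=SETTINGS["max_steps"],
            learning_rate=2e-4,                # assumed peak; decays toward ~3.6e-6
            optim=SETTINGS["optim"],
            output_dir="outputs",
        ),
    )
    trainer.train()
```

`train()` is deliberately not invoked here: it needs a CUDA machine with `unsloth` and `trl` installed, plus a tokenized dataset.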
+<details>
+<summary><strong>View Raw Training Log (JSON)</strong></summary>
+
+```json
 {
   "_runtime": 348,
   "_step": 60,