---
license: mit
datasets:
- glaiveai/glaive-function-calling-v2
- nickrosh/Evol-Instruct-Code-80k-v1
language:
- en
base_model:
- LiquidAI/LFM2.5-1.2B-Instruct
tags:
- tool-use
- code
- unsloth
- liquid
- fine-tune
library_name: unsloth
---
# 🧠 LFM-2.5-1.2B-Coding-Tools
This is a fine-tuned version of **Liquid LFM-2.5-1.2B-Instruct**, specialized for **Python coding** and **native tool calling**. It was trained using [Unsloth](https://github.com/unslothai/unsloth) on a hybrid dataset of coding instructions and Pythonic function calls.
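As a quick illustration of the intended workflow, the snippet below builds a tool schema and parses a tool-call response. The exact chat template and tool-call wrapper this model emits are assumptions here (modeled loosely on the glaive-function-calling-v2 style), not a verified format; the `run_python` tool is hypothetical.

```python
import json

# Hypothetical tool definition in JSON-Schema style (not this model's
# verified schema format).
tool_schema = {
    "name": "run_python",
    "description": "Execute a Python snippet and return its stdout",
    "parameters": {
        "type": "object",
        "properties": {"code": {"type": "string"}},
        "required": ["code"],
    },
}

# An assumed tool-call response in the style of glaive-function-calling-v2.
model_output = '{"name": "run_python", "arguments": {"code": "print(2 + 2)"}}'

# Parse the call and route it to the matching tool.
call = json.loads(model_output)
assert call["name"] == tool_schema["name"]
print(call["arguments"]["code"])  # payload handed to the tool
```

In a real loop you would append the tool's result back into the conversation and let the model continue from there.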
## 📉 Training Results & Metrics
This model was fine-tuned on a Google Colab **Tesla T4** instance. The following metrics were recorded during the final training run.
| Metric | Value | Description |
| :--- | :--- | :--- |
| **Final Loss** | `0.7431` | Training loss at the last logged step. |
| **Average Train Loss** | `0.8274` | Mean training loss over the full run. |
| **Epochs** | `0.96` | Completed ~1 full pass over the dataset. |
| **Global Steps** | `60` | Total number of optimizer updates. |
| **Runtime** | `594s` (~10 min) | Total wall-clock time for training. |
| **Samples/Second** | `0.808` | Training throughput on the T4 GPU. |
| **Gradient Norm** | `0.345` | Indicates stable training (no exploding gradients). |
| **Learning Rate** | `3.64e-6` | Final learning rate after decay. |
| **Total FLOS** | `2.07e15` | Total floating-point operations computed. |
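The throughput figures above can be cross-checked against each other; the quick sanity check below recovers the logged `0.101` steps/second and implies an effective batch of about 8 samples per optimizer step (a derived figure, not one reported in the log).

```python
# Cross-check the reported throughput metrics for internal consistency.
train_runtime = 594.2969       # seconds (from the raw log)
global_steps = 60
samples_per_second = 0.808

steps_per_second = global_steps / train_runtime
total_samples = samples_per_second * train_runtime
effective_batch = total_samples / global_steps

print(round(steps_per_second, 3))  # 0.101, matching the logged value
print(round(effective_batch))      # ~8 samples per optimizer step
```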
### 🛠️ Hardware & Framework
* **Hardware:** NVIDIA Tesla T4 (Google Colab Free Tier)
* **Framework:** Unsloth (PyTorch)
* **Quantization:** 4-bit (QLoRA)
* **Optimizer:** AdamW 8-bit
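The setup above can be sketched with Unsloth's standard QLoRA recipe. Only the 4-bit loading, the AdamW 8-bit optimizer, the base model name, and the 60-step budget come from this card; the LoRA rank, batch size, sequence length, and peak learning rate are illustrative placeholders, not the exact values used for this run.

```python
# Sketch of an Unsloth QLoRA fine-tune matching the hardware/framework
# notes above. Hyperparameters marked "illustrative" are assumptions.
def build_trainer(dataset, max_seq_length=2048):
    from unsloth import FastLanguageModel
    from trl import SFTTrainer
    from transformers import TrainingArguments

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="LiquidAI/LFM2.5-1.2B-Instruct",
        max_seq_length=max_seq_length,
        load_in_4bit=True,               # QLoRA: 4-bit base weights
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,                            # illustrative LoRA rank
        lora_alpha=16,                   # illustrative
    )
    return SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        args=TrainingArguments(
            per_device_train_batch_size=2,   # illustrative
            gradient_accumulation_steps=4,   # illustrative
            max_steps=60,                    # matches the logged global_step
            learning_rate=2e-4,              # assumed peak; log shows decay to 3.64e-6
            optim="adamw_8bit",
            output_dir="outputs",
        ),
    )
```

With these illustrative values the effective batch would be 2 × 4 = 8 samples per step, which lines up with the throughput figures in the table.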
<details>
<summary><strong>View Raw Training Log (JSON)</strong></summary>
```json
{
"_runtime": 348,
"_step": 60,
"_timestamp": 1770910365.0772636,
"_wandb.runtime": 348,
"total_flos": 2069937718053888,
"train/epoch": 0.96,
"train/global_step": 60,
"train/grad_norm": 0.3452725112438202,
"train/learning_rate": 0.000003636363636363636,
"train/loss": 0.7431,
"train_loss": 0.8273822158575058,
"train_runtime": 594.2969,
"train_samples_per_second": 0.808,
"train_steps_per_second": 0.101
}
```
</details>