---
license: mit
datasets:
- glaiveai/glaive-function-calling-v2
- nickrosh/Evol-Instruct-Code-80k-v1
language:
- en
base_model:
- LiquidAI/LFM2.5-1.2B-Instruct
tags:
- tool-use
- code
- unsloth
- liquid
- fine-tune
library_name: unsloth
---
| |
# 🧠 LFM-2.5-1.2B-Coding-Tools

This is a fine-tuned version of **Liquid LFM-2.5-1.2B-Instruct**, specialized for **Python coding** and **native tool calling**. It was trained with [Unsloth](https://github.com/unslothai/unsloth) on a hybrid dataset of coding instructions and Pythonic function calls.

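Since the model is trained on Pythonic function calls, its tool-call completions can be parsed on the client side with the standard `ast` module. A minimal sketch, assuming a bracketed-call output format like `[get_weather(city="Paris")]` (the exact format depends on the chat template you use; `get_weather` is a hypothetical tool name):

```python
import ast


def parse_tool_calls(text: str):
    """Parse a Pythonic tool-call string like '[get_weather(city="Paris")]'
    into a list of (function_name, kwargs) pairs."""
    tree = ast.parse(text.strip(), mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a list of tool calls")
    calls = []
    for node in tree.body.elts:
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError("expected a simple function call")
        # literal_eval only accepts constants, so arbitrary code is rejected
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls


print(parse_tool_calls('[get_weather(city="Paris", unit="C")]'))
# → [('get_weather', {'city': 'Paris', 'unit': 'C'})]
```

Using `ast.literal_eval` on the argument values keeps the parser safe: anything other than literal constants raises instead of executing.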
## 📉 Training Results & Metrics

This model was fine-tuned on a Google Colab **Tesla T4** instance. The following metrics were recorded during the final training run.

| Metric | Value | Description |
| :--- | :--- | :--- |
| **Final Loss** | `0.7431` | Cross-entropy loss at the final step. |
| **Average Train Loss** | `0.8274` | Mean loss over the full run. |
| **Epochs** | `0.96` | ~1 full pass over the dataset. |
| **Global Steps** | `60` | Total number of optimizer updates. |
| **Runtime** | `594 s` (~10 min) | Total wall-clock training time. |
| **Samples/Second** | `0.808` | Training throughput on the T4. |
| **Gradient Norm** | `0.345` | Final gradient norm (stable, no exploding gradients). |
| **Learning Rate** | `3.64e-6` | Final learning rate after decay. |
| **Total FLOs** | `2.07e15` | Total floating-point operations. |

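The logged numbers are internally consistent, which is a useful sanity check. Assuming a linear-decay schedule with a peak learning rate of `2e-4` and 5 warmup steps (common Unsloth defaults; these two values are assumptions, not logged), the final learning rate and the effective batch size both fall out of simple arithmetic:

```python
# Sanity-check the logged metrics against each other.
peak_lr = 2e-4         # ASSUMED peak LR (common Unsloth default, not in the log)
warmup, total = 5, 60  # warmup ASSUMED; total steps taken from the log

# Linear decay: the LR applied at the last step (computed at step 59)
final_lr = peak_lr * (total - 59) / (total - warmup)
print(f"{final_lr:.3e}")  # → 3.636e-06, matching train/learning_rate

# Throughput implies the effective batch size per optimizer step:
samples = 0.808 * 594.2969   # samples/sec × runtime ≈ 480 samples seen
print(round(samples / 60))   # → 8 samples per optimizer update
```

An effective batch of 8 on a T4 typically means a small per-device batch combined with gradient accumulation.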
### 🛠️ Hardware & Framework
* **Hardware:** NVIDIA Tesla T4 (Google Colab free tier)
* **Framework:** Unsloth (PyTorch)
* **Quantization:** 4-bit (QLoRA)
* **Optimizer:** AdamW 8-bit

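As a rough illustration of why 4-bit QLoRA makes this run feasible on a 16 GB T4, here is the base-weight footprint of a 1.2B-parameter model at different precisions (weights only; LoRA adapters, 8-bit optimizer states, and activations add more on top):

```python
# Approximate base-model weight footprint at different precisions.
params = 1.2e9  # 1.2B parameters

for name, bytes_per_param in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: {gib:.2f} GiB")
# fp16:  2.24 GiB
# 8-bit: 1.12 GiB
# 4-bit: 0.56 GiB  → ample headroom on a 16 GB T4
```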
<details>
<summary><strong>View Raw Training Log (JSON)</strong></summary>

```json
{
  "_runtime": 348,
  "_step": 60,
  "_timestamp": 1770910365.0772636,
  "_wandb.runtime": 348,
  "total_flos": 2069937718053888,
  "train/epoch": 0.96,
  "train/global_step": 60,
  "train/grad_norm": 0.3452725112438202,
  "train/learning_rate": 0.000003636363636363636,
  "train/loss": 0.7431,
  "train_loss": 0.8273822158575058,
  "train_runtime": 594.2969,
  "train_samples_per_second": 0.808,
  "train_steps_per_second": 0.101
}
```

</details>