---
base_model: unsloth/LFM2.5-1.2B-Instruct
library_name: peft
model_name: lfm-finetuned
pipeline_tag: text-generation
tags:
- generated_from_trainer
- hf_jobs
- trl
- unsloth
- sft
- lora
- peft
licence: license
datasets:
- mlabonne/FineTome-100k
---
# lfm-finetuned
A LoRA adapter fine-tuned on top of [`unsloth/LFM2.5-1.2B-Instruct`](https://huggingface.co/unsloth/LFM2.5-1.2B-Instruct), trained with [TRL](https://github.com/huggingface/trl)'s SFT trainer on [`mlabonne/FineTome-100k`](https://huggingface.co/datasets/mlabonne/FineTome-100k).
> **Note:** this repo contains the **LoRA adapter only** (`adapter_model.safetensors` + `adapter_config.json`), not a full standalone model. Load it on top of the base model with `peft`, or merge it once and use it as a regular causal LM (see below).
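To check what the adapter expects before loading it, you can read `adapter_config.json` directly. The sketch below is illustrative only: `summarize_adapter_config` is a hypothetical helper, the local path is an assumption (the file must already be downloaded), and the field names are the ones PEFT normally writes for LoRA adapters. The actual values stored in this repo are not reproduced here.

```python
import json
from pathlib import Path


def summarize_adapter_config(path):
    """Return the main LoRA settings from a PEFT adapter_config.json file.

    Hypothetical helper: keys missing from the file come back as None.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    keys = ("peft_type", "base_model_name_or_path", "r", "lora_alpha", "target_modules")
    return {key: config.get(key) for key in keys}


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the adapter files were downloaded.
    local_file = Path("lfm-finetuned/adapter_config.json")
    if local_file.exists():
        for key, value in summarize_adapter_config(local_file).items():
            print(f"{key}: {value}")
    else:
        print(f"{local_file} not found; download the adapter repo first.")
```

If `base_model_name_or_path` does not match the base model you plan to load, the adapter weights will most likely not line up with it.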
## Install
```bash
pip install -U torch transformers peft accelerate
```
## Quick start — load the adapter on top of the base model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_id = "unsloth/LFM2.5-1.2B-Instruct"
adapter_id = "MenemAI/lfm-finetuned"

# Load the tokenizer from the adapter repo and the weights from the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="cuda",
    trust_remote_code=True,
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
# Chat-style input: a list of {"role", "content"} messages.
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=512,
    return_full_text=False,
)[0]
print(output["generated_text"])
```
CPU-only? Replace `device_map="cuda"` with `device_map="cpu"` (or `"auto"`); generation will be slow, but it works.
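If the same script has to run on machines with and without a GPU, the choice can be made at runtime. This is a minimal sketch, not part of the original card: `pick_device_map` is a hypothetical helper name, and it relies only on `torch.cuda.is_available()`.

```python
import importlib.util


def pick_device_map():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so there is nothing to place on a GPU.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device_map = pick_device_map()
print(f"Using device_map={device_map!r}")
```

The result can then be passed as `device_map=device_map` to `AutoModelForCausalLM.from_pretrained` in the quick-start code.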
## Run on Hugging Face Jobs
Save the script below as `test.py`; it runs as-is with `hf jobs uv run`. The PEP 723 header tells `uv` which dependencies to install inside the job.
```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "torch",
#     "transformers",
#     "peft",
#     "accelerate",
# ]
# ///
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_id = "unsloth/LFM2.5-1.2B-Instruct"
adapter_id = "MenemAI/lfm-finetuned"

# Load the tokenizer from the adapter repo and the weights from the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="cuda", trust_remote_code=True
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base, adapter_id).eval()

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator(
    [{"role": "user", "content": "Hello!"}],
    max_new_tokens=512,
    return_full_text=False,
)[0]["generated_text"])
```
```bash
hf jobs uv run --flavor a10g-small ./test.py
```
## Optional — merge the adapter into the base model
If you want a single self-contained checkpoint (faster cold start, no `peft` at inference time):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2.5-1.2B-Instruct", torch_dtype="auto", trust_remote_code=True
)

# Fold the LoRA weights into the base weights and drop the PEFT wrappers.
merged = PeftModel.from_pretrained(base, "MenemAI/lfm-finetuned").merge_and_unload()
merged.save_pretrained("lfm-merged")

# Save the tokenizer alongside so the folder loads on its own.
tokenizer = AutoTokenizer.from_pretrained("MenemAI/lfm-finetuned", trust_remote_code=True)
tokenizer.save_pretrained("lfm-merged")
```
After merging you can load it with a plain `pipeline("text-generation", model="./lfm-merged", device="cuda")` or push it to a new repo with `hf upload <your-user>/lfm-merged ./lfm-merged`.
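For intuition about what merging does, here is a toy, pure-Python illustration. It assumes the standard LoRA formulation, where each adapted layer computes `W x + (lora_alpha / r) * B (A x)` and merging replaces `W` with `W + (lora_alpha / r) * B A`. The matrices and hyperparameters are made up for the example and have nothing to do with this adapter's real weights.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


# Toy layer: 3 inputs, 2 outputs, LoRA rank 1 (all values invented).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen base weight (out x in)
A = [[0.5, -0.5, 1.0]]                    # LoRA down-projection (r x in)
B = [[2.0], [1.0]]                        # LoRA up-projection (out x r)
r, lora_alpha = 1, 2
scaling = lora_alpha / r

x = [[1.0], [2.0], [3.0]]  # one input as a column vector

# Adapter kept separate: base path plus scaled low-rank path.
unmerged = add(matmul(W, x), scale(matmul(B, matmul(A, x)), scaling))

# Adapter merged: a single dense weight, no extra matmuls at inference time.
W_merged = add(W, scale(matmul(B, A), scaling))
merged = matmul(W_merged, x)

print(unmerged, merged)
```

Both paths give the same output, which is why the merged checkpoint can be served without `peft`; the only differences to expect in practice are small floating-point rounding effects.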
## Training
- **Method:** SFT via TRL
- **Base model:** `unsloth/LFM2.5-1.2B-Instruct`
- **Dataset:** `mlabonne/FineTome-100k`
- **Acceleration:** Unsloth
- **Infrastructure:** Hugging Face Jobs
### Framework versions
- TRL: 0.22.2
- Transformers: 4.57.3
- PyTorch: 2.10.0
- Datasets: 4.3.0
- Tokenizers: 0.22.2
- PEFT: required at inference time when loading the adapter directly
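To compare a local environment against the versions listed above, a short check with the standard library is enough. This is a sketch, not a compatibility guarantee: `CARD_VERSIONS`, `installed_version`, and `compare_with_card` are illustrative names, and the distribution names are the usual PyPI ones.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card; PEFT has no version listed there.
CARD_VERSIONS = {
    "trl": "0.22.2",
    "transformers": "4.57.3",
    "torch": "2.10.0",
    "datasets": "4.3.0",
    "tokenizers": "0.22.2",
    "peft": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_with_card(card_versions=CARD_VERSIONS):
    """Return {package: (installed, listed)} for every package in the card."""
    return {name: (installed_version(name), listed) for name, listed in card_versions.items()}


for name, (installed, listed) in compare_with_card().items():
    print(f"{name:<14} installed={installed or 'missing':<14} card={listed or 'not listed'}")
```

A mismatch does not necessarily mean the adapter will fail to load; the listed versions describe the training environment, and only `transformers`, `peft`, and `torch` are needed for the inference examples.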
## Citations
```bibtex
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
```