---
base_model: unsloth/LFM2.5-1.2B-Instruct
library_name: peft
model_name: lfm-finetuned
pipeline_tag: text-generation
tags:
- generated_from_trainer
- hf_jobs
- trl
- unsloth
- sft
- lora
- peft
licence: license
datasets:
- mlabonne/FineTome-100k
---
# lfm-finetuned

A LoRA adapter fine-tuned on top of [`unsloth/LFM2.5-1.2B-Instruct`](https://huggingface.co/unsloth/LFM2.5-1.2B-Instruct), trained with [TRL](https://github.com/huggingface/trl)'s SFT trainer on [`mlabonne/FineTome-100k`](https://huggingface.co/datasets/mlabonne/FineTome-100k).

> **Note:** this repo contains the **LoRA adapter only** (`adapter_model.safetensors` + `adapter_config.json`), not a full standalone model. Load it on top of the base model with `peft`, or merge it once and use it as a regular causal LM (see below).
## Install

```bash
pip install -U torch transformers peft accelerate
```
## Quick start — load the adapter on top of the base model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_id = "unsloth/LFM2.5-1.2B-Instruct"
adapter_id = "MenemAI/lfm-finetuned"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="cuda",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=512,
    return_full_text=False,
)[0]
print(output["generated_text"])
```
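Passing a list of chat messages (rather than a raw string) makes the pipeline apply the model's chat template automatically, so the prompt is formatted the same way it was during fine-tuning.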
CPU-only? Replace `device_map="cuda"` with `device_map="cpu"` (or `"auto"`); generation will be slow, but it works.
## Run on Hugging Face Jobs

The script below works as-is with `hf jobs uv run`. The PEP 723 header makes `uv` install the right deps inside the job.
```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "torch",
#     "transformers",
#     "peft",
#     "accelerate",
# ]
# ///
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_id = "unsloth/LFM2.5-1.2B-Instruct"
adapter_id = "MenemAI/lfm-finetuned"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="cuda", trust_remote_code=True
)
model = PeftModel.from_pretrained(base, adapter_id).eval()

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator(
    [{"role": "user", "content": "Hello!"}],
    max_new_tokens=512,
    return_full_text=False,
)[0]["generated_text"])
```
Save the script as `test.py`, then submit it:

```bash
hf jobs uv run --flavor a10g-small ./test.py
```
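Once submitted, you can follow the run from the same CLI. A quick sketch (the job ID is printed when you submit):

```bash
# List your jobs, then stream the logs of a specific one
hf jobs ps
hf jobs logs <job-id>
```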
## Optional — merge the adapter into the base model

If you want a single self-contained checkpoint (faster cold start, no `peft` at inference time):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2.5-1.2B-Instruct", torch_dtype="auto", trust_remote_code=True
)
merged = PeftModel.from_pretrained(base, "MenemAI/lfm-finetuned").merge_and_unload()
merged.save_pretrained("lfm-merged")
AutoTokenizer.from_pretrained("MenemAI/lfm-finetuned", trust_remote_code=True).save_pretrained("lfm-merged")
```
After merging you can load it with a plain `pipeline("text-generation", model="./lfm-merged", device="cuda")`, or push it to a new repo with `hf upload <your-user>/lfm-merged ./lfm-merged`.
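As a minimal sketch, assuming the `lfm-merged` directory written by the snippet above:

```python
from transformers import pipeline

# The merged checkpoint loads like any regular causal LM; peft is not needed.
generator = pipeline(
    "text-generation",
    model="./lfm-merged",
    device="cuda",
    trust_remote_code=True,
)
print(generator(
    [{"role": "user", "content": "Hello!"}],
    max_new_tokens=128,
    return_full_text=False,
)[0]["generated_text"])
```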
## Training

- **Method:** SFT via TRL
- **Base model:** `unsloth/LFM2.5-1.2B-Instruct`
- **Dataset:** `mlabonne/FineTome-100k`
- **Acceleration:** Unsloth
- **Infrastructure:** Hugging Face Jobs
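For orientation, here is a minimal sketch of an equivalent run using plain TRL + `peft`. The actual job used Unsloth's accelerated model loading on top of this, and the LoRA rank, alpha, and target modules below are illustrative assumptions, not the values used for this adapter:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# FineTome-100k ships in ShareGPT format ("conversations" with from/value
# keys); convert it to the role/content "messages" layout that TRL's
# SFTTrainer consumes directly.
role_map = {"system": "system", "human": "user", "gpt": "assistant"}
dataset = load_dataset("mlabonne/FineTome-100k", split="train")
dataset = dataset.map(
    lambda row: {
        "messages": [
            {"role": role_map[m["from"]], "content": m["value"]}
            for m in row["conversations"]
        ]
    },
    remove_columns=dataset.column_names,
)

trainer = SFTTrainer(
    model="unsloth/LFM2.5-1.2B-Instruct",
    train_dataset=dataset,
    # Assumed LoRA config; the actual rank/alpha/target modules for this
    # run are not published in the card.
    peft_config=LoraConfig(r=16, lora_alpha=16, target_modules="all-linear"),
    args=SFTConfig(
        output_dir="lfm-finetuned",
        model_init_kwargs={"trust_remote_code": True},
    ),
)
trainer.train()
```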
### Framework versions

- TRL: 0.22.2
- Transformers: 4.57.3
- PyTorch: 2.10.0
- Datasets: 4.3.0
- Tokenizers: 0.22.2
- PEFT: required at inference time when loading the adapter directly
## Citations

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```