# LFM25-1.2B-CodeAgent-opus4-default
A LoRA adapter for LiquidAI/LFM2.5-1.2B-Instruct, fine-tuned to follow the smolagents CodeAgent format.
## Model Description
This adapter teaches LFM2.5-1.2B to respond in the structured Thought + Code format required by smolagents CodeAgent:
Thought: I need to calculate this.
```python
result = 2 + 2
final_answer(result)
```
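Outside of smolagents, a reply in this format can be split into its parts with a small helper. This is a sketch: `parse_codeagent_reply` is a hypothetical name, not part of smolagents or this repository.

```python
import re


def parse_codeagent_reply(text: str):
    """Split a model reply into its Thought and Code parts.

    Assumes the `Thought:` + fenced ```python block layout shown above;
    returns (thought, code), or (None, None) if the reply does not match.
    """
    m = re.search(
        r"Thought:\s*(?P<thought>.*?)\s*```(?:python)?\s*\n(?P<code>.*?)```",
        text,
        re.DOTALL,
    )
    if not m:
        return None, None
    return m.group("thought").strip(), m.group("code").strip()


reply = (
    "Thought: I need to calculate this.\n"
    "```python\n"
    "result = 2 + 2\n"
    "final_answer(result)\n"
    "```"
)
thought, code = parse_codeagent_reply(reply)
```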
## Key Features
- Base Model: LiquidAI/LFM2.5-1.2B-Instruct (1.17B parameters)
- Token Accuracy: 90.2% on training data
- Single-Turn Rate: 53.8% of tasks solved in one turn
- Adapter Size: ~42MB (LoRA rank=16, alpha=32)
## Training Details
### Training Data
- 130 successful CodeAgent trajectories generated using Claude Opus 4.5
- Tasks include mathematical reasoning, string manipulation, file operations, and general problem-solving
- Each trajectory demonstrates the Thought → Code → Observation → final_answer pattern
- Average trajectory length: 2971 tokens
### Training Configuration
| Parameter | Value |
|---|---|
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Target Modules | w1, w2, w3, q_proj, k_proj, v_proj, out_proj, in_proj |
| Trainable Parameters | 11.1M (0.94% of base model) |
| Epochs | 3 |
| Learning Rate | 1e-4 |
| Batch Size | 2 (effective 8 with gradient accumulation) |
| Max Sequence Length | 8192 |
| Hardware | NVIDIA RTX 3090 (24GB) |
| Training Time | 260s |
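The settings above map onto a PEFT `LoraConfig` roughly as follows. This is a sketch: `task_type` is an assumption (causal-LM fine-tuning) and dropout is left at the PEFT default, since neither is listed in the table.

```python
from peft import LoraConfig

# LoRA hyperparameters from the table above; task_type is an assumption
# (causal-LM fine-tuning) and lora_dropout stays at the PEFT default.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "w1", "w2", "w3",
        "q_proj", "k_proj", "v_proj", "out_proj", "in_proj",
    ],
    task_type="CAUSAL_LM",
)
```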
### Training Framework

Trained with Hugging Face TRL and PEFT.
## Ablation Study Results
This model is part of a teacher ablation study comparing different Claude models and prompting strategies:
| Teacher Config | Token Accuracy | Training Time | Avg Trajectory Tokens |
|---|---|---|---|
| haiku-default | 73.8% | 461s | 2,957 |
| sonnet-default | 90.6% | 260s | 3,054 |
| opus4-default | 90.2% | 260s | 2,971 |
| sonnet4-terse | 94.9% | 93s | 632 |
| opus4-terse | 95.0% | 76s | 613 |
**Key finding:** Terse, focused trajectories from capable teachers transfer significantly better to small models.
## Usage
### With PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "LiquidAI/LFM2.5-1.2B-Instruct",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "krzysztofwos/LFM25-1.2B-CodeAgent-opus4-default")

# Generate
messages = [{"role": "user", "content": "What is 15 * 23?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # required for temperature/top_p to take effect
    temperature=0.1,
    top_p=0.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With smolagents
```python
from smolagents import CodeAgent, FinalAnswerTool, TransformersModel

model = TransformersModel(
    model_id="LiquidAI/LFM2.5-1.2B-Instruct",
    peft_model="krzysztofwos/LFM25-1.2B-CodeAgent-opus4-default",
)

agent = CodeAgent(
    tools=[FinalAnswerTool()],
    model=model,
)

result = agent.run("What is 15 * 23?")
print(result)
```
## Intended Use
- Code-assisted problem solving with small, efficient models
- Mathematical reasoning tasks
- Automated code generation following structured formats
- Research into prompt distillation and teacher model selection
## Limitations
- 1.2B model constraints: Limited reasoning depth compared to larger models
- English only: Trained on English-language tasks
- CodeAgent format specific: Optimized for smolagents Thought/Code/final_answer pattern
## Citation
If you use this model, please cite:
```bibtex
@misc{lfm25-codeagent-opus4_default-2025,
  author = {krzysztofwos},
  title = {LFM25-1.2B-CodeAgent-opus4-default},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/krzysztofwos/LFM25-1.2B-CodeAgent-opus4-default}
}
```
## Acknowledgments
- LiquidAI for the LFM2.5 base model
- Anthropic for Claude Opus 4.5 (teacher model)
- Hugging Face for smolagents, TRL, and PEFT