MistralClaw - LoRA Adapter for AI Agent Orchestration

Fine-tuned Mistral-7B-Instruct-v0.2 for multi-step tool calling, function calling, and AI agent orchestration.

Built for OpenClaw - an open-source personal AI agent.

Training Details

  • Base Model: mistralai/Mistral-7B-Instruct-v0.2
  • Method: LoRA (r=64, alpha=128, all attention + MLP layers)
  • Platform: Together AI managed fine-tuning
  • Dataset: 13,393 examples from 4 sources (12,054 train / 1,339 val)
  • Epochs: 3
  • Final Eval Loss: 0.4219
  • Training Time: ~10 minutes on A100
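The hyperparameters above can be expressed as a PEFT LoraConfig. This is a hedged sketch, not the exact Together AI configuration: only r=64, alpha=128, and the attention + MLP target modules are stated in this card; the dropout value is assumed.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the card's stated hyperparameters.
# Target-module names follow Mistral-7B's architecture.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumed; not stated in this card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
)
```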

Training Data Sources

Source                                    Examples   Purpose
Salesforce/xlam-function-calling-60k      5,000      Verified function calling
glaiveai/glaive-function-calling-v2       5,000      Multi-turn tool use
NousResearch/hermes-function-calling-v1   1,893      Hermes-style tool calls
teknium/OpenHermes-2.5                    1,500      No-tool knowledge (negative examples)
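A mixture like the one above can be assembled with a per-source sampling step. This is a minimal stdlib-only sketch under assumptions: the source keys and the `mix` helper are illustrative (loading and formatting of each dataset is elided), and the ~10% validation fraction is inferred from the 12,054 / 1,339 split.

```python
import random

# Per-source example budgets from the table above
BUDGET = {
    "xlam": 5000,
    "glaive": 5000,
    "hermes_fc": 1893,
    "openhermes": 1500,
}

def mix(sources: dict, seed: int = 0, val_frac: float = 0.1):
    """Sample each source down to its budget, shuffle, and split train/val."""
    rng = random.Random(seed)
    pool = []
    for name, examples in sources.items():
        k = min(BUDGET[name], len(examples))
        pool.extend(rng.sample(examples, k))
    rng.shuffle(pool)
    n_val = int(len(pool) * val_frac)  # ~10% held out for eval
    return pool[n_val:], pool[:n_val]
```

With full-size sources this yields the 12,054 / 1,339 split reported above.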

Loss Curve

  • Epoch 1: 0.4725
  • Epoch 2: 0.4278
  • Epoch 3: 0.4219

Usage

With PEFT/Transformers

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then apply the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "padmanabh/mistralclaw-lora")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

Tool Call Format

The model outputs tool calls as text with the [TOOL_CALLS] prefix:

[TOOL_CALLS] [{"name": "gmail_send", "arguments": "{\"to\": \"john@example.com\", \"subject\": \"Meeting\", \"body\": \"See you tomorrow\"}"}]

Tool results are provided as user messages with [TOOL_RESULT]:

[TOOL_RESULT] gmail_send: {"status": "success", "message_id": "abc123"}
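The two message formats above can be handled with a small amount of glue code. This is a hedged sketch using only the stdlib; the helper names (`parse_tool_calls`, `format_tool_result`) are illustrative and not part of any released OpenClaw API.

```python
import json

TOOL_CALLS_PREFIX = "[TOOL_CALLS]"

def parse_tool_calls(text: str):
    """Return a list of {"name", "arguments"} dicts, or [] if no tool call."""
    text = text.strip()
    if not text.startswith(TOOL_CALLS_PREFIX):
        return []
    payload = text[len(TOOL_CALLS_PREFIX):].strip()
    calls = json.loads(payload)
    for call in calls:
        # "arguments" arrives as a JSON-encoded string; decode it to a dict
        if isinstance(call.get("arguments"), str):
            call["arguments"] = json.loads(call["arguments"])
    return calls

def format_tool_result(name: str, result: dict) -> str:
    """Render a tool result as the [TOOL_RESULT] user message the model expects."""
    return f"[TOOL_RESULT] {name}: {json.dumps(result)}"

# Example round-trip with the gmail_send call shown above
output = '[TOOL_CALLS] [{"name": "gmail_send", "arguments": "{\\"to\\": \\"john@example.com\\"}"}]'
calls = parse_tool_calls(output)
```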

W&B Dashboard

Training tracked at: wandb.ai/padmanabhg-freelance/together

License

Apache 2.0

Hackathon

Built for Mistral AI Worldwide Hackathon - Tokyo Edition | Track 02: Fine-Tuning by W&B

Framework versions

  • PEFT 0.15.1