LocoTrainer


Introduction

LocoTrainer-4B is a 4B-parameter MS-SWIFT domain expert agent trained via knowledge distillation from Qwen3-Coder-Next. Unlike general-purpose code agents, it combines multi-turn tool calling with deep MS-SWIFT framework knowledge, enabling it to analyze codebases and generate comprehensive markdown reports without a separate reasoning model.

Attribute             LocoTrainer-4B
Base Model            Qwen3-4B-Instruct-2507
Teacher Model         Qwen3-Coder-Next
Training Method       Full-parameter SFT (distillation)
Training Data         361,830 samples (agent trajectory + MS-SWIFT knowledge + project paths)
Max Sequence Length   32,768 tokens
Training Hardware     8x NVIDIA H100 80GB
Training Time         ~25 hours
Framework             MS-SWIFT

Key Features

  • MS-SWIFT Domain Expert: Trained on MS-SWIFT documentation, CLI parameters, and project structure paths to answer framework questions accurately
  • Tool-Calling Agent: Generates structured <tool_call> JSON for Read, Grep, Glob, Bash, and Write tools
  • End-to-End Reports: From a single question to a complete, well-structured markdown analysis report
  • Long Context: Trained with a 32,768-token maximum sequence length, which covers 90% of long-context analysis scenarios
  • Local Deployment: A GGUF quantized version is available for local inference with zero API cost
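To make the tool-calling feature concrete, here is a minimal sketch of how a caller might pull `<tool_call>` blocks out of raw model output. It assumes each block wraps a single JSON object with `name` and `arguments` keys (the Qwen-style convention); the argument names `pattern` and `file_path` in the sample are hypothetical and are not taken from the model card.

```python
import json
import re

# Assumption: each call is a JSON object wrapped in <tool_call>...</tool_call>
# with "name" and "arguments" keys. The exact schema is not given in the card.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)
KNOWN_TOOLS = {"Read", "Grep", "Glob", "Bash", "Write"}  # tools named in the card


def extract_tool_calls(text):
    """Return a list of (name, arguments) pairs found in model output."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than crash
        if not isinstance(payload, dict):
            continue
        name = payload.get("name")
        if name in KNOWN_TOOLS:
            calls.append((name, payload.get("arguments", {})))
    return calls


# Hypothetical model output containing two tool calls.
sample = (
    "I will inspect the repository first.\n"
    "<tool_call>\n"
    '{"name": "Glob", "arguments": {"pattern": "/Users/developer/workspace/ms-swift/**/*.py"}}\n'
    "</tool_call>\n"
    "<tool_call>\n"
    '{"name": "Read", "arguments": {"file_path": "/Users/developer/workspace/ms-swift/README.md"}}\n'
    "</tool_call>"
)

calls = extract_tool_calls(sample)
for name, arguments in calls:
    print(name, arguments)
```

Malformed or unknown blocks are dropped silently here. A real agent loop would more likely report them back to the model so it can retry.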

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LocoreMind/LocoTrainer-4B"

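# Load the tokenizer and model; "auto" picks the checkpoint dtype and places
# the weights on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)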

messages = [
    {
        "role": "system",
        "content": "You are Claude Code, Anthropic's official CLI for Claude.\n\nYou are an interactive agent that helps users with software engineering tasks.\n\nCRITICAL CONSTRAINTS:\n1. ALWAYS use absolute file paths in tool calls.\n2. EFFICIENCY: Use multiple tool calls to explore the codebase.\n3. OUTPUT: Save your findings as a well-structured markdown document.\n\nENV: Working directory is /Users/developer/workspace (macOS, zsh)."
    },
    {
        "role": "user",
        "content": "What are the default LoRA settings in ms-swift?\n\nAnalyze the codebase at /Users/developer/workspace/ms-swift and save your findings as a well-structured markdown document to /Users/developer/workspace/output/output.md."
    }
]

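# Render the conversation with the model's chat template and append the
# assistant prefix so generation starts a new assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a single assistant turn (no tools are executed here).
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
)
# Drop the prompt tokens so only the newly generated tokens are decoded.
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

content = tokenizer.decode(output_ids, skip_special_tokens=True)
print(content)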
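Because the model was trained with a maximum sequence length of 32,768 tokens, it can help to check that the prompt plus the requested output stays inside that window before calling `generate`. The helper below is a small sketch of that check. `generation_budget` is a hypothetical name, and treating the training length as a hard inference limit is an assumption rather than a documented constraint.

```python
MAX_SEQUENCE_LENGTH = 32_768  # training max sequence length from the model card


def generation_budget(prompt_tokens, requested_new_tokens, max_len=MAX_SEQUENCE_LENGTH):
    """Clamp the number of new tokens so prompt + output stays within max_len."""
    if prompt_tokens >= max_len:
        raise ValueError(f"prompt uses {prompt_tokens} tokens, limit is {max_len}")
    return min(requested_new_tokens, max_len - prompt_tokens)


print(generation_budget(2_000, 1_024))   # 1024: the full request fits
print(generation_budget(32_000, 1_024))  # 768: only 768 tokens remain
```

In the Quick Start snippet, the prompt token count would be `model_inputs.input_ids.shape[1]`, and the returned value could be passed as `max_new_tokens`.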

LocoTrainer Framework

LocoTrainer-4B is designed to run inside the LocoTrainer agent framework, which handles the full agent loop: tool execution, multi-turn conversation, and report generation.

pip install locotrainer

locotrainer run -q "What are the default LoRA settings in ms-swift?"
# → output/output.md

For full setup and usage, refer to the GitHub repository.
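For intuition about what an agent loop of this kind involves, the following is a rough sketch and not the LocoTrainer implementation. It alternates between a model turn and tool execution until the model stops requesting tools. The `run_agent` function, the `Read`/`Write` argument names (`file_path`, `content`), and the use of a `tool` role for results are all assumptions made for illustration. A scripted stand-in replaces the real model so the example runs without downloading weights.

```python
import json
import re
import tempfile
from pathlib import Path

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def read_tool(arguments):
    return Path(arguments["file_path"]).read_text(encoding="utf-8")


def write_tool(arguments):
    path = Path(arguments["file_path"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(arguments["content"], encoding="utf-8")
    return f"wrote {path}"


# Hypothetical subset of the tools; Grep, Glob, and Bash are omitted for brevity.
HANDLERS = {"Read": read_tool, "Write": write_tool}


def run_agent(generate, messages, max_turns=8):
    """Alternate model turns and tool execution until no tool is requested."""
    for _ in range(max_turns):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        raw_calls = TOOL_CALL_RE.findall(reply)
        if not raw_calls:
            return reply  # no tool requested: treat this as the final answer
        for raw in raw_calls:
            try:
                call = json.loads(raw)
                handler = HANDLERS[call["name"]]
                result = handler(call["arguments"])
            except Exception as exc:  # report failures back to the model
                result = f"tool error: {exc!r}"
            messages.append({"role": "tool", "content": result})
    return None  # turn limit reached without a final answer


# Demo with a scripted stand-in for the model: one Write call, then a final reply.
workdir = Path(tempfile.mkdtemp())
report_path = workdir / "output" / "output.md"
write_call = {
    "name": "Write",
    "arguments": {"file_path": str(report_path), "content": "# Findings\n"},
}
scripted = iter([
    "<tool_call>\n" + json.dumps(write_call) + "\n</tool_call>",
    "Report saved.",
])
history = [{"role": "user", "content": "Summarize the default LoRA settings."}]
final = run_agent(lambda msgs: next(scripted), history)
print(final)
print(report_path.read_text(encoding="utf-8"))
```

In real use, `generate` would wrap the `apply_chat_template` and `model.generate` calls from the Quick Start, and the turn limit guards against a model that never stops calling tools.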

Training Details

Parameter             Value
Base model            Qwen3-4B-Instruct-2507
Teacher model         Qwen3-Coder-Next
Method                Full-parameter SFT
Training data         361,830 samples
Data composition      Agent trajectory + MS-SWIFT knowledge + project structure paths
Hardware              8x NVIDIA H100 80GB
DeepSpeed             ZeRO-2
Precision             BF16
Epochs                1
Max sequence length   32,768 tokens
Attention             Flash Attention 2
Kernel optimization   Liger Kernel
Learning rate         1e-5, warmup ratio 0.05
Batch size            1/GPU, gradient accumulation 4 (effective batch 32)
Template              qwen3_nothinking
Framework             MS-SWIFT
Training time         ~25 hours
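The effective batch size above follows from the per-GPU batch, the GPU count, and the gradient accumulation steps. The sketch below reproduces that arithmetic and derives a rough optimizer step count. The step and warmup figures assume one sample per sequence (no packing), a kept final partial batch, and ceiling rounding. None of these are stated in the model card, so read them as estimates.

```python
import math

# Values taken from the training table.
num_gpus = 8
per_gpu_batch = 1
grad_accum = 4
num_samples = 361_830
epochs = 1
warmup_ratio = 0.05

# Samples consumed per optimizer step.
effective_batch = per_gpu_batch * num_gpus * grad_accum

# Assumption: no sequence packing and the last partial batch is kept.
total_steps = math.ceil(num_samples / effective_batch) * epochs

# Assumption: warmup steps are rounded up; frameworks differ on rounding.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch)  # 32
print(total_steps)      # 11308
print(warmup_steps)     # 566
```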

Known Limitations

  • Specialized for MS-SWIFT; performance on unrelated codebases is untested
  • 4B parameters: complex multi-hop reasoning may require a larger model
  • MS-SWIFT project structure knowledge reflects the training data snapshot; may drift as the framework evolves

License

MIT

Acknowledgments

  • Qwen Team for the Qwen3-4B-Instruct-2507 base model
  • MS-SWIFT for the training framework and the codebase this model specializes in
  • llama.cpp for efficient local inference
  • Anthropic for the Claude Code agent loop design that inspired this work