Libraries

How to use AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt")
model = AutoModelForCausalLM.from_pretrained("AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
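
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server from the previous step is listening on localhost:8000. The helper names `build_chat_request` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000", timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` prints the reply. For the SGLang server described further down, pass `base_url="http://localhost:30000"`.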

SGLang

How to use AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt with Docker Model Runner:
```
docker model run hf.co/AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt
```

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

BTGenBot-2

AIRLab-POLIMI/llama-3.2-1b-it-ft-lora-bt is the fine-tuned Llama 3.2 1B Instruct model released with BTGenBot-2: Efficient Behavior Tree Generation with Small Language Models.

BTGenBot-2 generates executable robot Behavior Trees from:

a natural-language task description, and
a list of available robot action primitives.

The model outputs XML Behavior Trees compatible with BehaviorTree.CPP, so they can be used in ROS 2 behavior-tree pipelines.

For the complete project, examples, code, dataset, benchmark, and paper, visit:

👉 https://airlab-polimi.github.io/BTGenBot-2/

Model Details

Developed by: AIRLab, Politecnico di Milano
Authors: Riccardo Andrea Izzo, Gianluca Bardaro, Matteo Matteucci
Base model: meta-llama/Llama-3.2-1B-Instruct
Model type: Small language model for Behavior Tree generation
Fine-tuning method: QLoRA / LoRA parameter-efficient fine-tuning
Input: Natural-language robot task + available robot action primitives
Output: XML Behavior Tree compatible with BehaviorTree.CPP
Language: English
Project page: https://airlab-polimi.github.io/BTGenBot-2/
Paper: https://arxiv.org/abs/2602.01870
Code: https://github.com/AIRLab-POLIMI/BTGenBot-2

Intended Use

This model is intended to generate robot Behavior Trees for research and development in robotics task planning.

Example applications include:

ROS 2 / Nav2-compatible task planning;
navigation Behavior Tree generation;
manipulation Behavior Tree generation;
simulation-based robot-task validation;
benchmarking language-model-based Behavior Tree generation.

Input Format

The recommended input format is:

Task:
Describe the robot task in natural language.

Actions:
[ActionName(parameters: parameter_1, parameter_2), AnotherAction(parameters: parameter_1)]
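
A prompt in this format can be assembled programmatically. The following is a minimal sketch: `format_action` and `build_prompt` are hypothetical helpers, and the action names in the usage example are invented for illustration. The card does not say how a primitive without parameters should be written, so rendering it as a bare name is an assumption.

```python
def format_action(name, parameters=()):
    """Render one action primitive as ActionName(parameters: p1, p2)."""
    if not parameters:
        # ASSUMPTION: the card does not specify the no-parameter case.
        return name
    return f"{name}(parameters: {', '.join(parameters)})"


def build_prompt(task, actions):
    """Build the Task / Actions prompt from a task string and (name, params) pairs."""
    action_list = ", ".join(format_action(name, params) for name, params in actions)
    return f"Task:\n{task}\n\nActions:\n[{action_list}]"


# Hypothetical task and primitives, for illustration only.
prompt = build_prompt(
    "Go to the kitchen and pick up the bottle.",
    [("NavigateTo", ["location"]), ("PickObject", ["object_id"])],
)
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `messages` list has the same shape as the one passed to `pipe(...)` or `apply_chat_template(...)` in the Transformers examples.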

Output Format

The model is expected to return XML only:

<root BTCPP_format="4">
  <BehaviorTree ID="MainTree">
    ...
  </BehaviorTree>
</root>
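
Before a generated tree is handed to a robot stack, its basic shape can be checked. The sketch below uses Python's standard XML parser to confirm that the text is well-formed and follows the skeleton above. `check_behavior_tree` is a hypothetical helper. It does not verify that node names are valid in BehaviorTree.CPP or that the actions match the provided primitives.

```python
import xml.etree.ElementTree as ET


def check_behavior_tree(xml_text):
    """Return a list of problems; an empty list means the basic shape is fine."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    problems = []
    if root.tag != "root":
        problems.append(f"top-level element is <{root.tag}>, expected <root>")
    if root.get("BTCPP_format") != "4":
        problems.append('missing or unexpected BTCPP_format (expected "4")')

    trees = root.findall("BehaviorTree")
    if not trees:
        problems.append("no <BehaviorTree> element under <root>")
    for tree in trees:
        if tree.get("ID") is None:
            problems.append("<BehaviorTree> has no ID attribute")
        if len(tree) == 0:
            problems.append("<BehaviorTree> has no child nodes")
    return problems
```

A caller might treat a non-empty result as a reason to regenerate or reject the output. That policy is left to the application.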

Training Data

BTGenBot-2 was trained on a synthetic instruction-following dataset of 5,204 natural-language instruction / Behavior Tree pairs.

Each sample contains:

instruction: system-level instructions for Behavior Tree generation;
input: task description and available robot actions;
output: XML Behavior Tree.

The dataset was generated from real Behavior Trees and expanded through controlled synthetic generation.
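
The three fields of a sample can be represented as a small record type. The sketch below also maps a record onto chat messages. That mapping (instruction as system, input as user, output as assistant) is an assumption for illustration. The card does not state how samples were turned into training text, and the sample contents are invented.

```python
from dataclasses import dataclass


@dataclass
class BTSample:
    instruction: str  # system-level instructions for Behavior Tree generation
    input: str        # task description and available robot actions
    output: str       # XML Behavior Tree


def to_chat_messages(sample):
    """Map one sample onto chat roles (ASSUMED mapping, not from the card)."""
    return [
        {"role": "system", "content": sample.instruction},
        {"role": "user", "content": sample.input},
        {"role": "assistant", "content": sample.output},
    ]


# Invented example record, for illustration only.
sample = BTSample(
    instruction="Generate an XML Behavior Tree for the given task.",
    input="Task:\nWait for two seconds.\n\nActions:\n[Wait(parameters: seconds)]",
    output='<root BTCPP_format="4"><BehaviorTree ID="MainTree">...</BehaviorTree></root>',
)
for message in to_chat_messages(sample):
    print(message["role"], "->", message["content"][:40])
```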

See the full project page for details:

https://airlab-polimi.github.io/BTGenBot-2/

Training Procedure

The model was fine-tuned from meta-llama/Llama-3.2-1B-Instruct using QLoRA / LoRA.

Reported training details from the paper include:

Train/test split: 95% / 5%
Learning rate: 1e-4
Warmup ratio: 0.1
Batch size: 16
Training duration: approximately 30 hours
Hardware: 2 × NVIDIA Quadro RTX 6000 GPUs, 48 GB total VRAM
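
As a rough sanity check, the split and batch size can be combined with the dataset size of 5,204 pairs. The short calculation below assumes the split is rounded to the nearest sample and that 16 is the global batch size. Neither detail is confirmed by the card. Under those assumptions it gives about 4,944 training samples, 260 test samples, and 309 optimizer steps per epoch.

```python
import math

DATASET_SIZE = 5204    # instruction / Behavior Tree pairs (from the card)
TRAIN_FRACTION = 0.95  # 95% / 5% split (from the card)
BATCH_SIZE = 16        # from the card; ASSUMED to be the global batch size

# ASSUMPTION: the split is rounded to the nearest whole sample.
train_size = round(DATASET_SIZE * TRAIN_FRACTION)
test_size = DATASET_SIZE - train_size
steps_per_epoch = math.ceil(train_size / BATCH_SIZE)

print(f"train={train_size} test={test_size} steps_per_epoch={steps_per_epoch}")
```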

Citation

@article{izzo2026btgenbot,
  title={BTGenBot-2: Efficient Behavior Tree Generation with Small Language Models},
  author={Izzo, Riccardo Andrea and Bardaro, Gianluca and Matteucci, Matteo},
  journal={arXiv preprint arXiv:2602.01870},
  year={2026}
}

More Information

Project page: https://airlab-polimi.github.io/BTGenBot-2/
Code: https://github.com/AIRLab-POLIMI/BTGenBot-2
Paper: https://arxiv.org/abs/2602.01870

Safetensors

Model size: 1B params
Tensor type: BF16

Paper: BTGenBot-2: Efficient Behavior Tree Generation with Small Language Models (arXiv:2602.01870, published Feb 2)

Evaluation results

All results are self-reported on the BT Benchmark, with Error Recovery:

Zero-shot Success Rate: 90.380
One-shot Success Rate: 98.070
XML Syntax Correctness: 100.000
Action Coherency: 100.000