AGORA β€” Multi-Robot Task Planner v1

Part of the ANIMA Perception Suite by Robot Flow Labs / AIFLOW LABS LIMITED.

Overview

AGORA (Adaptive Group Operations & Resource Allocation) is the Wave-5 unified STEM (Spatio-Temporal-Embodiment Memory) framework for multi-robot collaboration. This repo contains the LoRA-finetuned planner model and all training artifacts for reproducibility.

The planner takes a heterogeneous robot team state (capabilities, battery, location, failure history) and a set of task requests, then produces optimal task-to-robot assignments with structured JSON reasoning.

Status: WEIGHTS PENDING RETRAIN

The original trained weights (merged model 2.9GB + LoRA adapter 71MB) were lost during a server disk cleanup. All training data, scripts, configs, and metrics are preserved for full reproducibility. Retrain with:

cd /path/to/project_agora
CUDA_VISIBLE_DEVICES=0,1 python scripts/train_planner.py

Estimated training time: ~2 hours on 2x NVIDIA L4.

Architecture

  • Base model: Qwen2.5-1.5B-Instruct
  • Fine-tuning: LoRA (r=16, alpha=32, dropout=0.05)
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Task: Multi-robot task allocation (structured JSON output)
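The hyperparameters above can be summarized in plain Python for reference (field names are illustrative; the authoritative values live in configs/training.toml):

```python
# LoRA hyperparameters from the card, as a plain dict (field names are
# illustrative; the authoritative values live in configs/training.toml).
LORA_CONFIG = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# In standard LoRA the adapter output is scaled by alpha / r before being
# added to the frozen projection, so this setup uses a scaling of 2.0.
scaling = LORA_CONFIG["lora_alpha"] / LORA_CONFIG["r"]
```

With alpha fixed at 2x the rank, adapter updates are weighted fairly aggressively relative to the frozen weights, which is a common default for small instruct models.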

Training Results

Metric                  Value
Train loss              0.234
Train runtime           1h 50m (6620 s)
Train samples/sec       2.27
Total steps             939 (3 epochs)
Eval format valid rate  100%
Eval task coverage      53.3%
Eval robot match rate   18.7%
Avg inference time      4.50 s

Analysis: The model produces valid JSON 100% of the time, so the output format was learned perfectly. Task allocation quality (53.3% task coverage, 18.7% robot match) still trails the heuristic baseline, which is expected for a 1.5B model learning a complex multi-factor optimization. Recommended improvements: more training data (10k-50k examples), a larger base model (Qwen2.5-7B), or an RLHF/DPO stage.
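The format-validity metric can be reproduced with a simple check; a minimal sketch (the actual scoring logic lives in scripts/eval_planner.py, and the required keys are assumed from the training data format):

```python
import json

def is_valid_plan(text: str) -> bool:
    """Return True if the model output parses as JSON with the expected
    top-level keys (schema assumed from the training data format)."""
    try:
        plan = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(plan, dict) and "assignments" in plan and "reasoning" in plan

# Toy outputs: one well-formed plan, one malformed completion.
outputs = [
    '{"assignments": [{"task": "task_1", "robot": "robot_1"}], "reasoning": "closest robot"}',
    "not json at all",
]
valid_rate = sum(is_valid_plan(o) for o in outputs) / len(outputs)  # 0.5 here
```

Task coverage and robot match rate would compare the parsed assignments against the reference plan rather than just the schema.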

Repository Contents

.
β”œβ”€β”€ README.md                              # This model card
β”œβ”€β”€ configs/
β”‚   └── training.toml                      # Full training configuration
β”œβ”€β”€ logs/
β”‚   β”œβ”€β”€ training_metrics.json              # Final training metrics
β”‚   β”œβ”€β”€ planning_train.jsonl               # 5000 training examples
β”‚   └── planning_eval.jsonl                # 200 evaluation examples
β”œβ”€β”€ reports/
β”‚   └── planner_eval.json                  # Evaluation results
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ train_planner.py                   # LoRA fine-tuning script
β”‚   β”œβ”€β”€ eval_planner.py                    # Evaluation script
β”‚   └── generate_planning_data.py          # Synthetic data generator
β”œβ”€β”€ pytorch/                               # (empty β€” retrain to populate)
β”‚   β”œβ”€β”€ merged/                            # Merged model weights
β”‚   └── lora_adapter/                      # LoRA adapter only
β”œβ”€β”€ onnx/                                  # (empty β€” export after retrain)
β”œβ”€β”€ checkpoints/                           # (empty β€” retrain to populate)
└── tensorrt/                              # (generate on target hardware)

Reproducing the Training

Prerequisites

  • 2x NVIDIA L4 (23GB each) or equivalent
  • CUDA 12.4, PyTorch 2.6.0
  • transformers, peft, trl, datasets
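The Python dependencies can be installed with pip; the torch pin matches the card, while the other packages are left unpinned as an assumption, and the torch wheel must match the CUDA 12.4 toolchain:

```shell
# Assumes a CUDA 12.4-capable driver; adjust the index URL for your setup.
pip install "torch==2.6.0" --index-url https://download.pytorch.org/whl/cu124
pip install transformers peft trl datasets safetensors
```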

Step 1: Generate training data (if starting fresh)

python scripts/generate_planning_data.py

Step 2: Train

CUDA_VISIBLE_DEVICES=0,1 python scripts/train_planner.py \
    --model Qwen/Qwen2.5-1.5B-Instruct \
    --epochs 3 \
    --batch-size 4 \
    --grad-accum 4 \
    --lr 2e-4 \
    --lora-r 16 \
    --lora-alpha 32

Step 3: Evaluate

python scripts/eval_planner.py

Step 4: Export (after training)

# SafeTensors (assumes `model` holds the merged fine-tuned model in memory)
from safetensors.torch import save_file
save_file(model.state_dict(), "pytorch/agora_planner_v1.safetensors")

# ONNX (dummy_input is a placeholder batch of token ids; its shape must
# match the model's expected input_ids)
import torch
dummy_input = torch.ones(1, 32, dtype=torch.long)
torch.onnx.export(model, dummy_input, "onnx/agora_planner_v1.onnx", opset_version=17)

Usage (after weights are available)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ilessio-aiflowlab/project_agora", subfolder="pytorch/merged")
tokenizer = AutoTokenizer.from_pretrained("ilessio-aiflowlab/project_agora", subfolder="pytorch/merged")

prompt = """You are a multi-robot task planner. Given the team state and tasks, produce assignments.

Team: [{"id": "robot_1", "type": "manipulator", "battery": 85, "location": "kitchen"}]
Tasks: [{"id": "task_1", "type": "fetch", "target": "cup", "location": "kitchen"}]

Assignments:"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
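Since the decoded output echoes the prompt followed by structured JSON, downstream code typically strips the prompt and parses the completion. A minimal sketch (the helper name is illustrative, not part of the repo):

```python
import json

def extract_assignments(decoded: str, prompt: str) -> dict:
    """Strip the echoed prompt, then parse the first JSON object in the
    completion. Raises ValueError if no JSON object is found."""
    completion = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    start = completion.find("{")
    end = completion.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in completion")
    return json.loads(completion[start:end + 1])

# Toy decoded output in the shape the usage example would produce.
decoded = 'Team: ... Assignments: {"assignments": [{"task": "task_1", "robot": "robot_1"}], "reasoning": "same room"}'
plan = extract_assignments(decoded, "Team: ...")
```

In production you would also validate the parsed plan against the team and task lists before dispatching assignments.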

Training Data Format

Each example in planning_train.jsonl is a conversation turn:

{
  "messages": [
    {"role": "system", "content": "You are AGORA, a multi-robot task planner..."},
    {"role": "user", "content": "Team state: {...}\nTasks: [...]"},
    {"role": "assistant", "content": "{\"assignments\": [...], \"reasoning\": \"...\"}"}
  ]
}
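A quick sanity check over planning_train.jsonl can confirm each line follows this schema; a sketch, assuming the strict system/user/assistant role order shown above:

```python
import json

def check_example(line: str) -> bool:
    """Validate one JSONL line against the conversation schema above:
    three turns in fixed role order, with a JSON assistant reply."""
    ex = json.loads(line)
    msgs = ex.get("messages", [])
    roles = [m.get("role") for m in msgs]
    if roles != ["system", "user", "assistant"]:
        return False
    # The assistant turn must itself be parseable JSON with assignments.
    try:
        reply = json.loads(msgs[-1]["content"])
    except json.JSONDecodeError:
        return False
    return "assignments" in reply

# Toy example mirroring the schema above.
sample = json.dumps({
    "messages": [
        {"role": "system", "content": "You are AGORA, a multi-robot task planner..."},
        {"role": "user", "content": "Team state: {}\nTasks: []"},
        {"role": "assistant", "content": '{"assignments": [], "reasoning": "none"}'},
    ]
})
```

Running this over the full file before training catches malformed lines early, before they surface as tokenizer or collator errors mid-run.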

AGORA Framework

The planner is one component of the larger AGORA system:

  • STEM Memory β€” Spatio-Temporal-Embodiment Memory for shared state
  • Brain/Cerebellum β€” Global planning + local execution control loop
  • Fleet Coordination β€” Multi-robot orchestration with fault tolerance
  • Inference β€” LLM/MLLM integration (this model)

Hardware

  • Training: 2x NVIDIA L4 (23GB each), CUDA 12.4
  • Inference: Any GPU with >= 4GB VRAM (quantized) or >= 8GB (bf16)
  • Edge target: NVIDIA Jetson Orin (after TensorRT export)
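The VRAM figures above follow from back-of-envelope weight sizes for a ~1.5B-parameter model; KV cache and activation overheads are not included, which is why the recommended minimums are higher:

```python
# Approximate weight memory for ~1.5e9 parameters at common precisions.
params = 1.5e9
bf16_gb = params * 2 / 1e9    # 2 bytes/param   -> ~3.0 GB
int8_gb = params * 1 / 1e9    # 1 byte/param    -> ~1.5 GB
int4_gb = params * 0.5 / 1e9  # 0.5 bytes/param -> ~0.75 GB
```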

License

Apache 2.0 β€” Robot Flow Labs / AIFLOW LABS LIMITED
