metadata
license: apache-2.0
language:
  - en
tags:
  - robotics
  - instruction-following
  - structured-generation
  - text-to-json
  - ros
  - ros2
  - sparse-transformer
  - embedded-ai
  - on-device
  - temporal-control
  - control-loop
pipeline_tag: text-generation
inference: false

Foros Robotics Action Engine

Foros is an ultra-compact, 10M-parameter instruction-to-JSON model designed for low-latency, on-device robotics control. It translates plain-English robot commands, including temporal loops, timed sequences, and FSM transitions, directly into structured JSON arrays of operations compatible with ROS / ROS2 and major industrial robot controllers (URScript, KRL, RAPID, Fanuc, DRL).

Developed by AMEFORGE (https://huggingface.co/AMFORGE). Built on the in-house SparseMind architecture (sparse token attention, sparse channel FFN, dynamic neuron typing).

Current version: v5.10 (production-ready; deployed on CPU, Jetson, and Raspberry Pi 4).


Benchmark Results (Held-Out, 142 Curated Examples)

Foros is evaluated on a held-out test suite of 142 hand-curated robotics commands spanning 5 difficulty tiers. None of these prompts appear in the training corpus. All measurements were taken on a Kaggle T4 GPU with greedy decoding.

Per-Tier Breakdown (v5.10)

| Tier | Description | N | Valid JSON | Op Correct | Exact Match |
|---|---|---|---|---|---|
| Tier 1 | Paraphrase (novel templates) | 32 | 100.0% | 100.0% | 100.0% |
| Tier 2 | Informal (natural language) | 29 | 100.0% | 96.6% | 93.1% |
| Tier 3 | Typos & noise robustness | 30 | 100.0% | 80.0% | 43.3% |
| Tier 4 | Multi-step sequences | 22 | 100.0% | 100.0% | 72.7% |
| Tier 5 | Long chains & temporal loops | 29 | 96.6% | 96.6% | 69.0% |
| Global (weighted) | | 142 | 99.3% | 94.4% | 76.1% |
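The "Global (weighted)" row can be reproduced from the per-tier rows by weighting each tier by its example count N. The sketch below assumes that is how the global figures were obtained; the names `TIERS` and `weighted_global` are illustrative and not part of any Foros tooling.

```python
# Per-tier results copied from the table above:
# (N, valid JSON %, op correct %, exact match %)
TIERS = {
    "Tier 1": (32, 100.0, 100.0, 100.0),
    "Tier 2": (29, 100.0, 96.6, 93.1),
    "Tier 3": (30, 100.0, 80.0, 43.3),
    "Tier 4": (22, 100.0, 100.0, 72.7),
    "Tier 5": (29, 96.6, 96.6, 69.0),
}


def weighted_global(tiers):
    """Return (total N, valid JSON %, op correct %, exact match %),
    with each percentage weighted by the tier's example count."""
    total = sum(row[0] for row in tiers.values())
    averages = [
        sum(row[0] * row[col] for row in tiers.values()) / total
        for col in (1, 2, 3)
    ]
    return (total, *[round(a, 1) for a in averages])


print(weighted_global(TIERS))  # (142, 99.3, 94.4, 76.1)
```

The result agrees with the reported global row to one decimal place.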

Version Trajectory

We report Exact Match on the held-out benchmark across successive iterations to document architectural and data improvements transparently:

| Version | Held-Out Exact Match | Held-Out JSON Valid | Notable Change |
|---|---|---|---|
| v5.4 (baseline) | 62.7% | ~99% | Initial production deployment |
| v5.9 | 63.4% | 100.0% | Numerical precision standardization, deterministic pick-and-place targets |
| v5.10 | 76.1% | 99.3% | Refactored conditional templates, expanded informal/imperative vocabulary, integer-form numerical prompts |

Head-to-Head Comparison (All Measured)

All baselines evaluated on the same 142-example held-out benchmark, same hardware (Kaggle T4 GPU), same scoring rubric, greedy decoding.

| Model | Exact Match | Valid JSON | Op Correct | Latency (avg) | Size |
|---|---|---|---|---|---|
| 🚀 Foros v5.10 (AMEFORGE) | 76.1% | 99.3% | 94.4% | 508 ms | 39.6 MB |
| Qwen2.5-1.5B-Instruct | 28.0% | 60.0% | 44.0% | 1,998 ms | 2,944 MB |
| Qwen2.5-0.5B-Instruct | 18.0% | 46.0% | 24.0% | 3,766 ms | 942 MB |
| TinyLlama-1.1B-Chat | 6.0% | 22.0% | 10.0% | 7,315 ms | 2,098 MB |
| SmolLM2-360M-Instruct | 0.0% | 6.0% | 2.0% | 5,884 ms | 690 MB |

Key takeaways:

  • Foros reaches 76.1% exact match on held-out robotics commands. The best general-purpose small LM evaluated (Qwen2.5-1.5B, ~150× larger) reaches only 28.0%, so Foros outperforms it by 48 percentage points.
  • The smallest comparable general-purpose LM (SmolLM2-360M, ~36× larger) reaches 0.0% exact match and only 6.0% valid JSON, indicating that general-purpose small models struggle even to produce syntactically valid output on this task.
  • Latency is 4× lower than Qwen2.5-1.5B and 14× lower than TinyLlama-1.1B.
  • The model file is 17× smaller than the smallest baseline (SmolLM2-360M) and 74× smaller than Qwen2.5-1.5B.
  • Runs on Raspberry Pi 4, Jetson Nano/Orin, or any embedded CPU. No GPU required for inference, no cloud dependency, no telemetry.
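The latency and size multipliers quoted above follow directly from the comparison table. A short check, using only the figures from that table (the dictionary layout and the `compare` helper are illustrative):

```python
# Figures copied from the head-to-head table above.
FOROS = {"exact": 76.1, "latency_ms": 508, "size_mb": 39.6}
BASELINES = {
    "Qwen2.5-1.5B-Instruct": {"exact": 28.0, "latency_ms": 1998, "size_mb": 2944},
    "Qwen2.5-0.5B-Instruct": {"exact": 18.0, "latency_ms": 3766, "size_mb": 942},
    "TinyLlama-1.1B-Chat": {"exact": 6.0, "latency_ms": 7315, "size_mb": 2098},
    "SmolLM2-360M-Instruct": {"exact": 0.0, "latency_ms": 5884, "size_mb": 690},
}


def compare(foros, baseline):
    """Return (latency ratio, size ratio, exact-match gap in points)."""
    return (
        round(baseline["latency_ms"] / foros["latency_ms"], 1),
        round(baseline["size_mb"] / foros["size_mb"], 1),
        round(foros["exact"] - baseline["exact"], 1),
    )


ratios = {name: compare(FOROS, row) for name, row in BASELINES.items()}
for name, (latency_x, size_x, gap) in ratios.items():
    print(f"{name}: {latency_x}x latency, {size_x}x size, +{gap} pts exact match")
```

For Qwen2.5-1.5B this gives roughly 3.9× latency, 74.3× size, and a 48.1-point gap, matching the rounded figures in the list.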

Latency profile: atomic commands (Tiers 1–3) run at ~305 ms and compound sequences (Tiers 4–5) at ~860 ms. The 508 ms average reflects the full benchmark distribution, including long temporal loops.
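As a rough consistency check, the two approximate latencies can be blended using the tier sizes from the per-tier table. This assumes the 508 ms figure is a per-example mean over all 142 prompts, which the card does not state explicitly.

```python
# Example counts from the per-tier table.
atomic_n = 32 + 29 + 30      # Tiers 1-3
compound_n = 22 + 29         # Tiers 4-5

# Approximate per-group latencies quoted above, in milliseconds.
atomic_ms = 305
compound_ms = 860

estimate_ms = (atomic_n * atomic_ms + compound_n * compound_ms) / (atomic_n + compound_n)
print(round(estimate_ms))  # 504
```

The estimate lands within a few milliseconds of the reported 508 ms, which is what one would expect given that both group latencies are themselves rounded.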


What it does

Atomic Commands

| Natural Language Input | Structured Output (ROS JSON) |
|---|---|
| move to x=0.5 y=-1.2 z=0.8 | [{"op":"move","x":0.5,"y":-1.2,"z":0.8}] |
| rotate joints to [0.0, 45.0, 90.0, 0.0, 0.0, 0.0] | [{"op":"joint_move","joints":[0.0,45.0,90.0,0.0,0.0,0.0]}] |
| close gripper with force 0.75 | [{"op":"gripper","action":"close","force":0.75}] |
| wait for 3.5 seconds | [{"op":"wait","seconds":3.5}] |
| set velocity to 0.75 m/s | [{"op":"speed","velocity":0.75}] |
| halt all motion | [{"op":"stop"}] |
| upon sensor_trip return to home position | [{"op":"safety","cond":"sensor_trip","then":[{"op":"home"}]}] |

Temporal / Loop Commands

| Natural Language Input | Structured Output |
|---|---|
| repeat 5 times: move arm | [{"op":"repeat","times":5,"body":[...]}] |
| keep doing move arm until obstacle | [{"op":"repeat_until","cond":"obstacle","body":[...]}] |
| run control loop at 100Hz for 2.5 seconds | [{"op":"control_loop","frequency_hz":100,"duration_s":2.5,"body":[...]}] |
| every 0.5s do rotate joints for 4 steps | [{"op":"timed_seq","interval_s":0.5,"count":4,"body":[...]}] |
| simultaneously move arm and set speed | [{"op":"parallel","branches":[[...],[...]]}] |
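Because temporal operations nest other operations under `body`, `then`, or `branches`, a consumer of Foros output typically needs to walk the structure recursively. The following is a minimal sketch of such a walker. The `iter_ops` helper is hypothetical, and the nested bodies in the sample program are made up for illustration, since the table above elides them.

```python
def iter_ops(program):
    """Yield every operation dict in a program, depth-first, including nested ones."""
    for op in program:
        yield op
        # "body" (loops) and "then" (safety conditions) hold a list of ops.
        for key in ("body", "then"):
            yield from iter_ops(op.get(key, []))
        # "branches" (parallel) holds a list of op lists.
        for branch in op.get("branches", []):
            yield from iter_ops(branch)


# Illustrative program; the nested contents are assumptions, not model output.
program = [
    {"op": "repeat", "times": 5, "body": [
        {"op": "move", "x": 0.5, "y": -1.2, "z": 0.8},
        {"op": "wait", "seconds": 3.5},
    ]},
    {"op": "parallel", "branches": [
        [{"op": "home"}],
        [{"op": "speed", "velocity": 0.75}],
    ]},
]

names = [op["op"] for op in iter_ops(program)]
print(names)  # ['repeat', 'move', 'wait', 'parallel', 'home', 'speed']
```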

Complex Sequences (Multi-step planning)

Input:  pick up the red_box at 0.5 0.5 0.0 and place it at -0.5 1.0 0.0

Output: [
  {"op":"move","x":0.5,"y":0.5,"z":0.0},
  {"op":"gripper","action":"close"},
  {"op":"move","x":0.5,"y":0.5,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.0},
  {"op":"gripper","action":"open"},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2}
]
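The seven-step output above follows a fixed approach, grasp, lift, transfer, lower, release, retreat pattern with a 0.2 clearance in z. A deterministic reference expansion like the one below can be useful for checking model output in tests. It is a sketch inferred from this single example: the `pick_and_place` helper and the default clearance of 0.2 are assumptions, not a documented part of Foros.

```python
def pick_and_place(pick, place, clearance=0.2):
    """Expand a pick-and-place into the 7-op pattern shown in the example above.

    `pick` and `place` are (x, y, z) tuples in absolute coordinates.
    """
    def move(x, y, z):
        # Round to avoid float noise such as 0.30000000000000004.
        return {"op": "move", "x": x, "y": y, "z": round(z, 6)}

    px, py, pz = pick
    qx, qy, qz = place
    return [
        move(px, py, pz),                      # approach the object
        {"op": "gripper", "action": "close"},  # grasp
        move(px, py, pz + clearance),          # lift
        move(qx, qy, qz + clearance),          # transfer above the target
        move(qx, qy, qz),                      # lower
        {"op": "gripper", "action": "open"},   # release
        move(qx, qy, qz + clearance),          # retreat
    ]


reference = pick_and_place((0.5, 0.5, 0.0), (-0.5, 1.0, 0.0))
print(len(reference))  # 7
```

Comparing `reference` against the parsed model output gives an exact-match check for this class of prompt.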

Supported Operations

| Category | Operations |
|---|---|
| Motion | move, joint_move, move_tcp, move_joint, home, trajectory |
| End Effector | gripper, tool, get_joint_values |
| Control Flow | wait, safety, stop, repeat, repeat_until |
| Temporal | timed_seq, control_loop, parallel, state_transition |
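Before dispatching a program to a robot controller, it may be worth checking that every `op` is one the downstream executor knows about. The sketch below builds the allow-list from the table above. Note that `speed` appears in the atomic-command examples but not in this table, so it is kept in a separate set here as an assumption. The `unsupported_ops` helper is illustrative.

```python
# Operations listed in the table above.
TABLE_OPS = {
    "move", "joint_move", "move_tcp", "move_joint", "home", "trajectory",
    "gripper", "tool", "get_joint_values",
    "wait", "safety", "stop", "repeat", "repeat_until",
    "timed_seq", "control_loop", "parallel", "state_transition",
}
# Seen in the examples but absent from the table (assumed to be supported).
EXAMPLE_ONLY_OPS = {"speed"}
SUPPORTED_OPS = TABLE_OPS | EXAMPLE_ONLY_OPS


def unsupported_ops(program):
    """Return the set of op names in `program` (including nested ops)
    that are not in SUPPORTED_OPS."""
    unknown = set()
    for op in program:
        name = op.get("op")
        if name not in SUPPORTED_OPS:
            unknown.add(name)
        for key in ("body", "then"):
            unknown |= unsupported_ops(op.get(key, []))
        for branch in op.get("branches", []):
            unknown |= unsupported_ops(branch)
    return unknown


print(unsupported_ops([{"op": "repeat", "times": 2, "body": [{"op": "fly"}]}]))  # {'fly'}
```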

Model Details

| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse attention + sparse FFN + dynamic neuron typing) |
| Parameters | 10,347,395 (~10.3 M) |
| Hidden size / Layers / Heads | 256 / 6 / 8 |
| Context length | 384 tokens |
| Tokenizer | In-house domain-specific BPE, vocab 3,000, atomic numerical tokens |
| Precision | FP32 |
| Model size | 39.6 MB |
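The stated model size is close to what FP32 storage alone would predict. A quick check, assuming 4 bytes per parameter and no compression:

```python
params = 10_347_395          # from the table above
bytes_per_param = 4          # FP32

weight_bytes = params * bytes_per_param
weight_mib = weight_bytes / 2**20    # binary mebibytes
weight_mb = weight_bytes / 10**6     # decimal megabytes

print(f"{weight_mib:.2f} MiB / {weight_mb:.2f} MB")  # 39.47 MiB / 41.39 MB
```

The 39.6 MB figure therefore appears to be reported in binary units, with the small remainder over 39.47 MiB presumably coming from checkpoint metadata such as the stored config. That last point is a guess; the card does not break the file size down.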

Training Methodology

Foros is trained on a hybrid corpus combining:

  • Programmatic synthetic data covering all supported operations, with paraphrastic variations (formal, informal, imperative tones), numerical precision variants, and compositional sequences of varying depth.
  • Curated production logs: anonymized real-world prompts collected from deployed instances, with manually verified ground-truth JSON targets.
  • Iterative refinement: successive versions (v5.4 → v5.9 → v5.10) integrate fixes derived from systematic failure analysis on the held-out benchmark.

Training is conducted from scratch (no pre-trained checkpoint) on a single T4 GPU in approximately 4 hours.

Detailed corpus composition, generator weights, and hyperparameter schedules are proprietary to AMEFORGE.


Known Limitations

  • Typo robustness: Tier 3 sits at 43.3% exact match. Severely mangled tokens (e.g., mvoe instead of move) can degrade numerical extraction. A typo-aware fine-tune is planned for v5.11.
  • Relative motion: Foros operates on absolute coordinates. Prompts like move left by 20 cm are out of domain and should be resolved by an upstream natural-language pre-processor that converts them to absolute positions.
  • Open-ended planning: Foros is a structured translator, not a planner. For multi-step reasoning beyond literal sequencing, pair it with an upstream planner.
  • Numerical fidelity in low-confidence contexts: when the prompt vocabulary is unfamiliar, the model may default to in-distribution coordinate values. For coordinate-critical operations in production, we recommend a lightweight regex post-processor that re-injects explicit numerical values from the prompt as a safety net.
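One possible shape for the regex post-processor mentioned in the last item is sketched below. It is deliberately narrow: it only recognizes the `x=… y=… z=…` form used in the examples above, and it only patches programs containing exactly one `move` op, leaving anything ambiguous untouched. The names `COORD_RE` and `reinject_coordinates` are hypothetical, and this is not AMEFORGE's own implementation.

```python
import re

# Matches explicit axis assignments such as "x=0.5" or "y = -1.2".
COORD_RE = re.compile(r"\b([xyz])\s*=\s*(-?\d+(?:\.\d+)?)")


def reinject_coordinates(prompt, program):
    """Overwrite x/y/z on a single `move` op with values stated in the prompt.

    Returns the program unchanged when the prompt has no explicit coordinates
    or when the program does not contain exactly one `move` op.
    """
    explicit = {axis: float(value) for axis, value in COORD_RE.findall(prompt)}
    move_count = sum(1 for op in program if op.get("op") == "move")
    if not explicit or move_count != 1:
        return program

    patched = []
    for op in program:
        if op.get("op") == "move":
            op = {**op, **explicit}  # prompt values win over model values
        patched.append(op)
    return patched


# Simulated case: a typo in the prompt led to in-distribution default coordinates.
prompt = "mvoe to x=0.5 y=-1.2 z=0.8"
model_output = [{"op": "move", "x": 0.5, "y": 0.5, "z": 0.0}]
print(reinject_coordinates(prompt, model_output))
# [{'op': 'move', 'x': 0.5, 'y': -1.2, 'z': 0.8}]
```

Multi-move sequences need an alignment strategy between prompt values and ops, which this sketch intentionally does not attempt.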

Local Inference

import os
import torch
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Download model weights (public)
model_file = hf_hub_download(repo_id="AMFORGE/foros", filename="foros.pt")

# Download tokenizer (gated; set the HF_TOKEN environment variable)
tok_file = hf_hub_download(
    repo_id="AMFORGE/foros_tok",
    filename="sparsforos_tokenizer.model",
    token=os.environ.get("HF_TOKEN"),
)

# Tokenizer
sp = spm.SentencePieceProcessor()
sp.Load(tok_file)

# Model: requires the SparseMind reference implementation
# (available with the tokenizer via AMEFORGE on request)
from sparsemind_robotics_train import SparseMind, Config

ckpt = torch.load(model_file, map_location="cpu", weights_only=False)
cfg = Config(**{k: v for k, v in ckpt["config"].items()
                if k in Config.__dataclass_fields__})
model = SparseMind(cfg)
model.load_state_dict(ckpt["model"])
model.eval()

# Inference: greedy decoding (top_k=1) is recommended for production
prompt = "move to x=0.5 y=-1.2 z=0.8 =>"
input_ids = torch.tensor([sp.EncodeAsIds(prompt)])
out_ids = model.generate(input_ids, max_new=128, temp=1.0, top_k=1)
result = sp.DecodeIds(out_ids[0, input_ids.shape[1]:].tolist())
print(result)
# [{"op":"move","x":0.5,"y":-1.2,"z":0.8}]
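Since valid-JSON rates on the held-out benchmark are high but not 100%, it is prudent to parse the decoded string defensively before passing it to a controller. A minimal sketch using only the standard library follows. The `parse_program` helper is illustrative, and its structural check (a list of objects, each with a string `op`) is an assumption based on the examples in this card.

```python
import json


def parse_program(raw):
    """Return the decoded list of ops, or None if `raw` is not a JSON array
    of objects that each carry a string "op" field."""
    try:
        program = json.loads(raw.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(program, list):
        return None
    if not all(isinstance(op, dict) and isinstance(op.get("op"), str) for op in program):
        return None
    return program


print(parse_program('[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]'))
# [{'op': 'move', 'x': 0.5, 'y': -1.2, 'z': 0.8}]
print(parse_program('[{"op":"move","x":0.5'))  # None (truncated output)
```

A `None` result can then trigger a retry, a fallback, or a safe stop, depending on the application.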

Citation

@misc{foros_robotics_v5_10,
  title  = {Foros v5.10: An On-Device Instruction-to-JSON Engine for Robotics},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture.
            https://huggingface.co/AMFORGE/foros}
}

License & Contact