Update README.md

20ee058 verified about 7 hours ago

10.2 kB

license: apache-2.0
language:
  - en
tags:
  - robotics
  - instruction-following
  - structured-generation
  - text-to-json
  - ros
  - ros2
  - sparse-transformer
  - embedded-ai
  - on-device
  - temporal-control
  - control-loop
pipeline_tag: text-generation
inference: false

Foros Robotics Action Engine

Foros is an ultra-compact 10M parameter instruction-to-JSON model designed for low-latency, on-device robotics control. It translates plain-English robot commands — including temporal loops, timed sequences, and FSM transitions — directly into structured JSON arrays of operations compatible with ROS / ROS2 and major industrial robot controllers (URScript, KRL, RAPID, Fanuc, DRL).

Developed by AMEFORGE — https://huggingface.co/AMFORGE. Built on the in-house SparseMind architecture (sparse token attention, sparse channel FFN, dynamic neuron typing).

Current version: v5.10 — production-ready, deployed on CPU / Jetson / Raspberry Pi 4.

Benchmark Results (Held-Out, 142 Curated Examples)

Foros is evaluated on a held-out test suite of 142 hand-curated robotics commands spanning 5 difficulty tiers. None of these prompts appear in the training corpus. All measurements taken on Kaggle T4 GPU, greedy decoding.

Per-Tier Breakdown — v5.10

Tier	Description	N	Valid JSON	Op Correct	Exact Match
Tier 1	Paraphrase (novel templates)	32	100.0%	100.0%	100.0%
Tier 2	Informal (natural language)	29	100.0%	96.6%	93.1%
Tier 3	Typos & noise robustness	30	100.0%	80.0%	43.3%
Tier 4	Multi-step sequences	22	100.0%	100.0%	72.7%
Tier 5	Long chains & temporal loops	29	96.6%	96.6%	69.0%
Global (weighted)		142	99.3%	94.4%	76.1%

Version Trajectory

We report Exact Match on the held-out benchmark across successive iterations to document architectural and data improvements transparently:

Version	Held-Out Exact Match	Held-Out JSON Valid	Notable Change
v5.4 (baseline)	62.7%	~99%	Initial production deployment
v5.9	63.4%	100.0%	Numerical precision standardization, deterministic pick-and-place targets
v5.10	76.1%	99.3%	Refactored conditional templates, expanded informal/imperative vocabulary, integer-form numerical prompts

Head-to-Head Comparison (All Measured)

All baselines evaluated on the same 142-example held-out benchmark, same hardware (Kaggle T4 GPU), same scoring rubric, greedy decoding.

Model	Exact Match	Valid JSON	Op Correct	Latency (avg)	Size
🚀 Foros v5.10 — AMEFORGE	76.1%	99.3%	94.4%	508 ms	39.6 MB
Qwen2.5-1.5B-Instruct	28.0%	60.0%	44.0%	1{,}998 ms	2{,}944 MB
Qwen2.5-0.5B-Instruct	18.0%	46.0%	24.0%	3{,}766 ms	942 MB
TinyLlama-1.1B-Chat	6.0%	22.0%	10.0%	7{,}315 ms	2{,}098 MB
SmolLM2-360M-Instruct	0.0%	6.0%	2.0%	5{,}884 ms	690 MB

Key takeaways:

Foros reaches 76.1% exact match on held-out robotics commands. The best general-purpose small LM evaluated (Qwen2.5-1.5B, ~150× larger) reaches only 28.0% — Foros outperforms it by +48 percentage points.
The smallest comparable general-purpose LM (SmolLM2-360M, ~36× larger) reaches 0.0% exact match and only 6.0% valid JSON, indicating that general-purpose small models struggle even to produce syntactically valid output on this task.
4× lower latency than Qwen2.5-1.5B, 14× lower than TinyLlama-1.1B.
17× smaller than the smallest competitive baseline (SmolLM2-360M), 74× smaller than Qwen2.5-1.5B.
Runs on Raspberry Pi 4, Jetson Nano/Orin, or any embedded CPU. No GPU required for inference, no cloud dependency, no telemetry.

Latency profile — atomic commands (Tier 1–3) run at ~305 ms, compound sequences (Tier 4–5) at ~860 ms. The 508 ms average reflects the full benchmark distribution including long temporal loops.

What it does

Atomic Commands

Natural Language Input	Structured Output (ROS JSON)
`move to x=0.5 y=-1.2 z=0.8`	`[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]`
`rotate joints to [0.0, 45.0, 90.0, 0.0, 0.0, 0.0]`	`[{"op":"joint_move","joints":[0.0,45.0,90.0,0.0,0.0,0.0]}]`
`close gripper with force 0.75`	`[{"op":"gripper","action":"close","force":0.75}]`
`wait for 3.5 seconds`	`[{"op":"wait","seconds":3.5}]`
`set velocity to 0.75 m/s`	`[{"op":"speed","velocity":0.75}]`
`halt all motion`	`[{"op":"stop"}]`
`upon sensor_trip return to home position`	`[{"op":"safety","cond":"sensor_trip","then":[{"op":"home"}]}]`

Temporal / Loop Commands

Natural Language Input	Structured Output
`repeat 5 times: move arm`	`[{"op":"repeat","times":5,"body":[...]}]`
`keep doing move arm until obstacle`	`[{"op":"repeat_until","cond":"obstacle","body":[...]}]`
`run control loop at 100Hz for 2.5 seconds`	`[{"op":"control_loop","frequency_hz":100,"duration_s":2.5,"body":[...]}]`
`every 0.5s do rotate joints for 4 steps`	`[{"op":"timed_seq","interval_s":0.5,"count":4,"body":[...]}]`
`simultaneously move arm and set speed`	`[{"op":"parallel","branches":[[...],[...]]}]`

Complex Sequences (Multi-step planning)

Input:  pick up the red_box at 0.5 0.5 0.0 and place it at -0.5 1.0 0.0

Output: [
  {"op":"move","x":0.5,"y":0.5,"z":0.0},
  {"op":"gripper","action":"close"},
  {"op":"move","x":0.5,"y":0.5,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2},
  {"op":"move","x":-0.5,"y":1.0,"z":0.0},
  {"op":"gripper","action":"open"},
  {"op":"move","x":-0.5,"y":1.0,"z":0.2}
]

Supported Operations

Category	Operations
Motion	`move`, `joint_move`, `move_tcp`, `move_joint`, `home`, `trajectory`
End Effector	`gripper`, `tool`, `get_joint_values`
Control Flow	`wait`, `safety`, `stop`, `repeat`, `repeat_until`
Temporal	`timed_seq`, `control_loop`, `parallel`, `state_transition`

Model Details

Property	Value
Architecture	SparseMind (decoder-only, sparse attention + sparse FFN + dynamic neuron typing)
Parameters	10,347,395 (~10.3 M)
Hidden size / Layers / Heads	256 / 6 / 8
Context length	384 tokens
Tokenizer	In-house domain-specific BPE, vocab 3,000, atomic numerical tokens
Precision	FP32
Model size	39.6 MB

Training Methodology

Foros is trained on a hybrid corpus combining:

Programmatic synthetic data covering all supported operations, with paraphrastic variations (formal, informal, imperative tones), numerical precision variants, and compositional sequences of varying depth.
Curated production logs — anonymized real-world prompts collected from deployed instances, with manually verified ground-truth JSON targets.
Iterative refinement — successive versions (v5.4 → v5.9 → v5.10) integrate fixes derived from systematic failure analysis on the held-out benchmark.

Training is conducted from scratch (no pre-trained checkpoint) on a single T4 GPU in approximately 4 hours.

Detailed corpus composition, generator weights, and hyperparameter schedules are proprietary to AMEFORGE.

Known Limitations

Typo robustness — Tier 3 sits at 43.3% exact match. Severely mangled tokens (e.g., mvoe instead of move) can degrade numerical extraction. A typo-aware fine-tune is planned for v5.11.
Relative motion — Foros operates on absolute coordinates. Prompts like move left by 20 cm are out of domain and should be resolved by an upstream natural-language pre-processor that converts them to absolute positions.
Open-ended planning — Foros is a structured translator, not a planner. For multi-step reasoning beyond literal sequencing, pair it with an upstream planner.
Numerical fidelity in low-confidence contexts — when the prompt vocabulary is unfamiliar, the model may default to in-distribution coordinate values. For coordinate-critical operations in production, we recommend a lightweight regex post-processor that re-injects explicit numerical values from the prompt as a safety net.

Local Inference

import os
import torch
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Download model weights (public)
model_file = hf_hub_download(repo_id="AMFORGE/foros", filename="foros.pt")

# Download tokenizer (gated — set HF_TOKEN environment variable)
tok_file = hf_hub_download(
    repo_id="AMFORGE/foros_tok",
    filename="sparsforos_tokenizer.model",
    token=os.environ.get("HF_TOKEN"),
)

# Tokenizer
sp = spm.SentencePieceProcessor()
sp.Load(tok_file)

# Model — requires the SparseMind reference implementation
# (available with the tokenizer via AMEFORGE on request)
from sparsemind_robotics_train import SparseMind, Config

ckpt = torch.load(model_file, map_location="cpu", weights_only=False)
cfg = Config(**{k: v for k, v in ckpt["config"].items()
                if k in Config.__dataclass_fields__})
model = SparseMind(cfg)
model.load_state_dict(ckpt["model"])
model.eval()

# Inference — greedy decoding recommended for production
prompt = "move to x=0.5 y=-1.2 z=0.8 =>"
input_ids = torch.tensor([sp.EncodeAsIds(prompt)])
out_ids = model.generate(input_ids, max_new=128, temp=1.0, top_k=1)
result = sp.DecodeIds(out_ids[0, input_ids.shape[1]:].tolist())
print(result)
# [{"op":"move","x":0.5,"y":-1.2,"z":0.8}]

Citation

@misc{foros_robotics_v5_10,
  title  = {Foros v5.10: An On-Device Instruction-to-JSON Engine for Robotics},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture.
            https://huggingface.co/AMFORGE/foros}
}

License & Contact

Model weights: Apache 2.0
Tokenizer: gated access — contact AMEFORGE
Inquiries: https://huggingface.co/AMFORGE