metadata

license: gemma
base_model: unsloth/gemma-3-270m-it
language:
  - en
pipeline_tag: text-generation
tags:
  - robotics
  - text-to-json
  - instruction-following
  - mujoco
  - gemma3
library_name: transformers

LLM-Tank — Gemma-3 270M → robot JSON

Source-code: https://codeberg.org/imperius/llm-tank

Fine-tuned Gemma-3 270M that translates one free-form English instruction for a tracked robot with a gripper arm into a strict JSON command list, executed in a MuJoCo simulation.

Full pipeline: text → this model → valid JSON → controller → robot drives / grasps. Code & sim: see the source repository.
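To make the controller stage of that pipeline concrete, here is a minimal sketch of how a consumer might walk the command list. It is not the repository's controller: `dry_run` is a hypothetical helper that only integrates `move` and `turn` into a 2D pose by dead reckoning. It assumes that a left turn is counter-clockwise and that the robot starts at the origin facing +x; the `action`, `direction`, `distance_m` and `angle_deg` keys follow the example output shown later in this card.

```python
import math

def dry_run(plan: dict) -> tuple:
    """Integrate move/turn commands into (x_m, y_m, heading_deg).

    Hypothetical helper for illustration only: no physics, no arm.
    Assumes left = counter-clockwise and a start pose of (0, 0, 0 deg).
    """
    x = y = heading = 0.0
    for cmd in plan.get("commands", []):
        action = cmd.get("action")
        if action == "move":
            sign = 1.0 if cmd["direction"] == "forward" else -1.0
            step = sign * float(cmd["distance_m"])
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
        elif action == "turn":
            sign = 1.0 if cmd["direction"] == "left" else -1.0
            heading = (heading + sign * float(cmd["angle_deg"])) % 360.0
        # stop, wait, grasp and release do not change the pose in this sketch
    return x, y, heading

plan = {"commands": [
    {"action": "move", "direction": "forward", "distance_m": 2.0},
    {"action": "turn", "direction": "left", "angle_deg": 90},
    {"action": "move", "direction": "forward", "distance_m": 1.0},
]}
x, y, heading = dry_run(plan)
print(round(x, 3), round(y, 3), heading)  # 2.0 1.0 90.0
```

A dry run like this can serve as a cheap sanity check on a plan before it is handed to the simulator.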

What it outputs

A single JSON object {"commands": [ ... ]}. Actions:

move — direction (forward|backward), distance_m, speed?
turn — direction (left|right), angle_deg, speed?
stop, wait — duration_s
grasp / release — optional cell ∈ front|front_left|front_right|left|right (discrete, relative to the robot; IK is solved by the controller, not the model)
out-of-scope / nonsense → {"commands": []}

The model emits no coordinates — only discrete actions/enums (this keeps generation reliable and schema-checkable).
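Because every field is a number or an enum, the output can be checked with a few lines of plain Python. The sketch below is an illustration, not the JSON Schema from the source repository: `is_valid_plan` is a hypothetical name, the enum values are copied from the action list above, and the required/optional split (for example, `stop` accepting any extra fields and `speed` left unchecked) is an assumption.

```python
# Enum values come from the action list above; which fields are required
# is an assumption made for this sketch.
ACTIONS = ("move", "turn", "stop", "wait", "grasp", "release")
DIRECTIONS = {"move": ("forward", "backward"), "turn": ("left", "right")}
CELLS = ("front", "front_left", "front_right", "left", "right")

def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def is_valid_plan(plan) -> bool:
    """Return True if `plan` looks like {"commands": [...]} with known actions."""
    if not isinstance(plan, dict) or not isinstance(plan.get("commands"), list):
        return False
    for cmd in plan["commands"]:
        if not isinstance(cmd, dict) or cmd.get("action") not in ACTIONS:
            return False
        action = cmd["action"]
        if action in DIRECTIONS:
            amount = "distance_m" if action == "move" else "angle_deg"
            if cmd.get("direction") not in DIRECTIONS[action]:
                return False
            if not _is_number(cmd.get(amount)):
                return False
        elif action == "wait":
            if not _is_number(cmd.get("duration_s")):
                return False
        elif action in ("grasp", "release"):
            # cell is optional, but must be a known enum value when present
            if "cell" in cmd and cmd["cell"] not in CELLS:
                return False
    return True

print(is_valid_plan({"commands": [{"action": "grasp", "cell": "front_left"}]}))  # True
print(is_valid_plan({"commands": [{"action": "move", "direction": "up", "distance_m": 1}]}))  # False
```

An empty command list passes the check, which matches the out-of-scope convention `{"commands": []}`.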

Required input format (IMPORTANT)

The model was trained with the same prompt format that it expects at inference (train == infer): a fixed short system prompt folded together with the instruction into ONE user turn. Use exactly this format:

import json
from transformers import AutoModelForCausalLM, AutoTokenizer

SYSTEM = ("You translate ONE English instruction for a tracked robot "
          "with a gripper arm into a single JSON object "
          '{"commands":[...]} using actions: move, turn, stop, wait, '
          "grasp, release. Output ONLY the JSON object, no prose, no "
          'markdown. If the instruction is out of scope or nonsense, '
          'output {"commands": []}.')

tok = AutoTokenizer.from_pretrained("PATH_OR_REPO")
model = AutoModelForCausalLM.from_pretrained("PATH_OR_REPO",
                                             torch_dtype="auto",
                                             device_map="auto")

def translate(instruction: str) -> dict:
    # Fold the system prompt and the instruction into ONE user turn,
    # matching the training format.
    user = SYSTEM + "\n\n---\nINSTRUCTION: " + instruction.strip()
    enc = tok.apply_chat_template(
        [{"role": "user", "content": user}],
        tokenize=True, add_generation_prompt=True,
        return_dict=True, return_tensors="pt").to(model.device)
    # Greedy decoding, capped at 160 new tokens.
    out = model.generate(**enc, max_new_tokens=160, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    txt = tok.decode(out[0][enc["input_ids"].shape[1]:],
                     skip_special_tokens=True)
    # Keep the outermost {...} span in case stray text surrounds the JSON.
    i, j = txt.find("{"), txt.rfind("}")
    try:
        return json.loads(txt[i:j + 1])
    except Exception:
        return {"commands": []}  # safe fallback

print(translate("go forward 2 meters then turn left"))
# {"commands": [{"action": "move", "direction": "forward",
#   "distance_m": 2.0}, {"action": "turn", "direction": "left",
#   "angle_deg": 90}]}
print(translate("pick it up"))        # {"commands": [{"action": "grasp"}]}
print(translate("make me a coffee"))  # {"commands": []}

Use greedy decoding (do_sample=False). Roughly 99% of outputs are schema-valid without constrained decoding, so the remaining ~1% must be caught downstream: always keep the safe fallback.

Metrics (held-out val, 352 examples: locomotion + manipulation + OOD)

| metric | value |
| --- | --- |
| schema_valid_rate | 0.991 |
| exact_match_rate | 0.943 |
| action_seq_accuracy | 0.980 |
| ood_f1 | 0.857 |
| task_success (MuJoCo, 40) | 0.975 |
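The card does not spell out how these metrics are defined, so the following is only one plausible reading, sketched for readers who want to score their own runs. The hypothetical `score` function assumes that exact match means equality of the parsed JSON objects, that action-sequence accuracy compares only the ordered action names, and that OOD F1 treats an empty command list as the positive class. Both inputs are non-empty lists of already parsed dicts.

```python
def action_names(plan: dict) -> list:
    """Ordered action names of a plan, ignoring all parameters."""
    return [c.get("action") for c in plan.get("commands", []) if isinstance(c, dict)]

def score(preds: list, golds: list) -> dict:
    """Score predicted plans against gold plans (assumed metric definitions)."""
    n = len(golds)
    exact = sum(p == g for p, g in zip(preds, golds))
    seq = sum(action_names(p) == action_names(g) for p, g in zip(preds, golds))
    # OOD detection: the positive class is an empty command list.
    pred_ood = [p.get("commands") == [] for p in preds]
    gold_ood = [g.get("commands") == [] for g in golds]
    tp = sum(p and g for p, g in zip(pred_ood, gold_ood))
    fp = sum(p and not g for p, g in zip(pred_ood, gold_ood))
    fn = sum(g and not p for p, g in zip(pred_ood, gold_ood))
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"exact_match_rate": exact / n,
            "action_seq_accuracy": seq / n,
            "ood_f1": f1}

golds = [
    {"commands": [{"action": "move", "direction": "forward", "distance_m": 2.0}]},
    {"commands": []},
    {"commands": [{"action": "grasp"}]},
    {"commands": []},
]
preds = [
    {"commands": [{"action": "move", "direction": "forward", "distance_m": 2.0}]},
    {"commands": []},
    {"commands": [{"action": "grasp", "cell": "left"}]},  # right action, extra field
    {"commands": [{"action": "stop"}]},                   # missed an OOD input
]
result = score(preds, golds)
print(result["exact_match_rate"], result["action_seq_accuracy"], round(result["ood_f1"], 3))
# 0.5 0.75 0.667
```

Note that under these definitions a parse failure that falls back to `{"commands": []}` is counted as an OOD prediction.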

Training

Full fine-tuning (not LoRA) of unsloth/gemma-3-270m-it on ~3.5k synthetic instruction→JSON pairs (generated with 120B models, validated against a JSON Schema). Trained in fp32 on a Kaggle T4, in two phases: locomotion first, then the arm (grasp/release). Details are in the source repo (docs/).

Demo

demo.mp4 (in this repo) — ~1 min, two panes: left = command + model JSON output, right = the robot acting in MuJoCo (real model + real physics, not staged).

Limitations

No perception: the model can't target objects by name/color, only by discrete relative cell. Object resolution is spatial (controller grabs the nearest graspable body in the chosen cell).
English only. Single fixed gripper, minimal custom arm.
Designed for the accompanying controller/sim; raw JSON is meaningless without it.
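The spatial object resolution described above can be pictured with a small sketch. Everything specific here is an assumption for illustration, not the repository's implementation: the `pick_target` name, the angular sector bounds for each cell, the robot frame (x forward, y left), the `front` default, and the reach limit are all invented.

```python
import math

# Hypothetical sector bounds in degrees, measured in the robot frame
# (0 deg = straight ahead, positive = to the left). Assumed values.
CELL_SECTORS = {
    "front": (-15.0, 15.0),
    "front_left": (15.0, 60.0),
    "left": (60.0, 120.0),
    "front_right": (-60.0, -15.0),
    "right": (-120.0, -60.0),
}

def pick_target(bodies: dict, cell: str = "front", max_reach_m: float = 0.6):
    """Return the name of the nearest body inside `cell`, or None.

    `bodies` maps a body name to its (x, y) position in the robot frame.
    `max_reach_m` is a hypothetical reach limit.
    """
    lo, hi = CELL_SECTORS[cell]
    best_name, best_dist = None, math.inf
    for name, (bx, by) in bodies.items():
        dist = math.hypot(bx, by)
        bearing = math.degrees(math.atan2(by, bx))
        if lo <= bearing < hi and dist <= max_reach_m and dist < best_dist:
            best_name, best_dist = name, dist
    return best_name

bodies = {"cube_a": (0.40, 0.02), "cube_b": (0.25, 0.20), "cube_c": (0.30, -0.01)}
print(pick_target(bodies, "front"))       # cube_c
print(pick_target(bodies, "front_left"))  # cube_b
print(pick_target(bodies, "right"))       # None
```

The sketch shows why the model never needs object names: the cell narrows the search to a sector, and distance alone picks the body.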

License

Weights are a derivative of Google Gemma-3 — use is governed by the Gemma Terms of Use. Accompanying code is under its own license (see the source repository).