---
license: gemma
base_model: unsloth/gemma-3-270m-it
language:
- en
pipeline_tag: text-generation
tags:
- robotics
- text-to-json
- instruction-following
- mujoco
- gemma3
library_name: transformers
---

# LLM-Tank: Gemma-3 270M → robot JSON

Source code: https://codeberg.org/imperius/llm-tank

Fine-tuned **Gemma-3 270M** that translates **one free-form English
instruction** for a tracked robot with a gripper arm into a strict JSON
command list, executed in a **MuJoCo** simulation.

Full pipeline: `text → this model → valid JSON → controller → robot drives / grasps`.
Code & sim: see the source repository.

## What it outputs

A single JSON object `{"commands": [ ... ]}`. Actions:

- `move`: `direction` (forward|backward), `distance_m`, `speed?`
- `turn`: `direction` (left|right), `angle_deg`, `speed?`
- `stop`, `wait`: `duration_s`
- `grasp` / `release`: optional `cell` ∈
  `front|front_left|front_right|left|right` (discrete, relative to the
  robot; IK is solved by the controller, **not** the model)
- out-of-scope / nonsense → `{"commands": []}`

The model emits **no coordinates**: only discrete actions and enums, plus
scalar magnitudes such as `distance_m`, `angle_deg` and `duration_s`. This
keeps generation reliable and schema-checkable.
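
The action list above can be checked in a few lines of plain Python. The following is a minimal sketch, not the project's validator: the authoritative JSON Schema lives in the source repository, and the rules here are transcribed by hand from the list. The `action` key name is taken from the example output in this card; treating `duration_s` as required only for `wait`, and leaving the type of `speed` unchecked, are assumptions.

```python
from typing import Any

# Assumed rules, hand-transcribed from the action list in this card.
# Tuples (not sets) are used so membership tests never fail on unhashable values.
ACTIONS = ("move", "turn", "stop", "wait", "grasp", "release")
DIRECTIONS = {"move": ("forward", "backward"), "turn": ("left", "right")}
REQUIRED_NUMBER = {"move": "distance_m", "turn": "angle_deg", "wait": "duration_s"}
CELLS = ("front", "front_left", "front_right", "left", "right")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_command(cmd: Any) -> bool:
    """Check one command dict against the assumed rules."""
    if not isinstance(cmd, dict):
        return False
    action = cmd.get("action")
    if not isinstance(action, str) or action not in ACTIONS:
        return False
    if action in DIRECTIONS and cmd.get("direction") not in DIRECTIONS[action]:
        return False
    if action in REQUIRED_NUMBER and not _is_number(cmd.get(REQUIRED_NUMBER[action])):
        return False
    # `speed` is listed only for move/turn; its type is not specified in the card.
    if "speed" in cmd and action not in DIRECTIONS:
        return False
    if "cell" in cmd and not (action in ("grasp", "release") and cmd["cell"] in CELLS):
        return False
    return True


def is_valid_output(obj: Any) -> bool:
    """Check the top-level {"commands": [...]} wrapper and every command in it."""
    if not isinstance(obj, dict) or set(obj) != {"commands"}:
        return False
    commands = obj["commands"]
    return isinstance(commands, list) and all(is_valid_command(c) for c in commands)
```

Running `is_valid_output` on the parsed model output before handing it to the controller gives a cheap second gate; anything that fails can be replaced with `{"commands": []}`.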

## Required input format (IMPORTANT)

The model was trained with `train == infer`: the same **fixed short system
prompt** is folded together with the instruction into ONE user turn, both
during training and at inference. Use exactly this format:

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fixed system prompt; keep it verbatim.
SYSTEM = ("You translate ONE English instruction for a tracked robot "
          "with a gripper arm into a single JSON object "
          '{"commands":[...]} using actions: move, turn, stop, wait, '
          "grasp, release. Output ONLY the JSON object, no prose, no "
          'markdown. If the instruction is out of scope or nonsense, '
          'output {"commands": []}.')

tok = AutoTokenizer.from_pretrained("PATH_OR_REPO")
model = AutoModelForCausalLM.from_pretrained("PATH_OR_REPO",
                                             torch_dtype="auto",
                                             device_map="auto")


def translate(instruction: str) -> dict:
    # System prompt and instruction are folded into ONE user turn.
    user = SYSTEM + "\n\n---\nINSTRUCTION: " + instruction.strip()
    enc = tok.apply_chat_template(
        [{"role": "user", "content": user}],
        tokenize=True, add_generation_prompt=True,
        return_dict=True, return_tensors="pt").to(model.device)
    out = model.generate(**enc, max_new_tokens=160, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    txt = tok.decode(out[0][enc["input_ids"].shape[1]:],
                     skip_special_tokens=True)
    # Keep the outermost {...} span and parse it.
    i, j = txt.find("{"), txt.rfind("}")
    try:
        return json.loads(txt[i:j + 1])
    except Exception:
        return {"commands": []}  # safe fallback


# json.dumps prints the result as JSON rather than as a Python dict repr.
print(json.dumps(translate("go forward 2 meters then turn left")))
# {"commands": [{"action": "move", "direction": "forward",
#   "distance_m": 2.0}, {"action": "turn", "direction": "left",
#   "angle_deg": 90}]}
print(json.dumps(translate("pick it up")))        # {"commands": [{"action": "grasp"}]}
print(json.dumps(translate("make me a coffee")))  # {"commands": []}
```

Decoding is greedy (`do_sample=False`). About 99% of outputs are
schema-valid without constrained decoding, so a small fraction will not
parse or validate; always keep the safe fallback.
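
The fallback can be separated from the model call so that it is testable on raw strings. The sketch below is one possible tightening, not the repository's code: `parse_or_fallback` is a hypothetical helper name, and the extra check that `commands` is a list is an assumption about what the controller expects.

```python
import json


def _fallback() -> dict:
    # A fresh object each time, so callers cannot mutate a shared default.
    return {"commands": []}


def parse_or_fallback(raw: str) -> dict:
    """Parse the outermost {...} span of raw model text, or return the empty command list."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return _fallback()  # no JSON object in the text
    try:
        obj = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return _fallback()  # truncated or malformed JSON
    if not isinstance(obj, dict) or not isinstance(obj.get("commands"), list):
        return _fallback()  # parsed, but not the expected wrapper
    return obj
```

For example, `parse_or_fallback('Sure! {"commands": []}')` ignores the stray prose around the object, and a truncated generation yields the empty command list, so the robot does nothing rather than acting on partial output.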

## Metrics (held-out val, 352 examples: locomotion + manipulation + OOD)

| metric | value |
| --- | --- |
| schema_valid_rate | 0.991 |
| exact_match_rate | 0.943 |
| action_seq_accuracy | 0.980 |
| ood_f1 | 0.857 |
| task_success (MuJoCo, 40) | 0.975 |
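
This card does not define the metrics, so the sketch below shows only one plausible reading of the first four, with every definition an assumption: `exact_match_rate` as full equality of the parsed objects, `action_seq_accuracy` as equality of the ordered action names, and `ood_f1` as F1 on the "empty command list" class. `evaluate` and its helpers are hypothetical names; the evaluation code in the source repository is the reference.

```python
from typing import Callable, Optional


def action_sequence(obj: Optional[dict]) -> Optional[list]:
    """Ordered action names, or None if the object has no command list."""
    if not isinstance(obj, dict) or not isinstance(obj.get("commands"), list):
        return None
    return [c.get("action") if isinstance(c, dict) else None
            for c in obj["commands"]]


def is_empty(obj: Optional[dict]) -> bool:
    """True for the out-of-scope answer {"commands": []}."""
    return isinstance(obj, dict) and obj.get("commands") == []


def evaluate(preds: list, refs: list, is_valid: Callable[[object], bool]) -> dict:
    """preds: parsed outputs (None if parsing failed); refs: gold objects."""
    if not refs or len(preds) != len(refs):
        raise ValueError("preds and refs must be non-empty and equally long")
    n = len(refs)
    pairs = list(zip(preds, refs))
    valid = sum(1 for p in preds if is_valid(p))
    exact = sum(1 for p, r in pairs if p == r)
    seq = sum(1 for p, r in pairs
              if action_sequence(p) is not None
              and action_sequence(p) == action_sequence(r))
    # OOD detection: "empty command list" is treated as the positive class.
    tp = sum(1 for p, r in pairs if is_empty(p) and is_empty(r))
    fp = sum(1 for p, r in pairs if is_empty(p) and not is_empty(r))
    fn = sum(1 for p, r in pairs if not is_empty(p) and is_empty(r))
    denom = 2 * tp + fp + fn
    return {
        "schema_valid_rate": valid / n,
        "exact_match_rate": exact / n,
        "action_seq_accuracy": seq / n,
        "ood_f1": (2 * tp / denom) if denom else 0.0,
    }
```

Under this reading, a prediction with the right actions but a wrong `distance_m` counts toward `action_seq_accuracy` but not `exact_match_rate`, which is consistent with the gap between those two rows in the table.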

## Training

Full fine-tuning (not LoRA) of `unsloth/gemma-3-270m-it` on ~3.5k
synthetic instruction→JSON pairs (generated with 120B models, validated
against a JSON Schema). Trained in fp32 on a Kaggle T4, in two phases:
locomotion first, then arm commands (grasp/release) added. Details are in
the source repo (`docs/`).

## Demo

`demo.mp4` (in this repo): about 1 min, two panes. Left: the command and
the model's JSON output. Right: the robot acting in MuJoCo (real model +
real physics, not staged).

## Limitations

- No perception: the model can't target objects by name/color, only by
  discrete relative `cell`. Object resolution is spatial (controller
  grabs the nearest graspable body in the chosen cell).
- English only. Single fixed gripper, minimal custom arm.
- Designed for the accompanying controller/sim; raw JSON is meaningless
  without it.
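
To illustrate what "nearest graspable body in the chosen cell" could mean, here is a purely hypothetical sketch. The real controller is in the source repository and may define cells differently: the bearing sectors, the robot frame (x forward, y left), the `max_reach_m` cutoff, and the `pick_target` name are all assumptions made up for this example.

```python
import math
from typing import Optional

# HYPOTHETICAL sector bounds in degrees, counter-clockwise from the robot's heading.
CELL_SECTORS = {
    "front": (-22.5, 22.5),
    "front_left": (22.5, 67.5),
    "left": (67.5, 112.5),
    "front_right": (-67.5, -22.5),
    "right": (-112.5, -67.5),
}


def pick_target(bodies: list, cell: str, max_reach_m: float = 0.5) -> Optional[str]:
    """Return the name of the closest body inside `cell`, or None.

    bodies: (name, x, y) tuples in the robot frame, x forward and y left.
    """
    low, high = CELL_SECTORS[cell]
    best_name, best_dist = None, max_reach_m
    for name, x, y in bodies:
        dist = math.hypot(x, y)
        bearing = math.degrees(math.atan2(y, x))
        # Keep the body only if it lies in the sector and is the closest so far.
        if low <= bearing < high and dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name
```

The point of the sketch is the division of labour: the model only names a `cell`, and all geometry (which body, where it is, how to reach it) is resolved outside the model.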

## License

Weights are a derivative of Google **Gemma-3**; use is governed by the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms). Accompanying
code is under its own license (see the source repository).