Instructions for using Imperius/llm-tank with libraries and local apps.

Libraries

How to use Imperius/llm-tank with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Imperius/llm-tank")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Imperius/llm-tank")
model = AutoModelForCausalLM.from_pretrained("Imperius/llm-tank")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Imperius/llm-tank with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Imperius/llm-tank"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Imperius/llm-tank",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
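
The curl call above sends a generic question, whereas the model card below requires a fixed system prompt folded with the instruction into a single user turn. The following is a minimal sketch of a Python client that builds such a request for the OpenAI-compatible endpoint shown above. The helper names `build_payload` and `ask_server`, and the `temperature` and `max_tokens` values, are assumptions for illustration; it also assumes the server applies the same chat template as `tokenizer.apply_chat_template`.

```python
import json
import urllib.request

# System prompt copied from the model card's "Required input format" section.
SYSTEM = ("You translate ONE English instruction for a tracked robot "
          "with a gripper arm into a single JSON object "
          '{"commands":[...]} using actions: move, turn, stop, wait, '
          "grasp, release. Output ONLY the JSON object, no prose, no "
          'markdown. If the instruction is out of scope or nonsense, '
          'output {"commands": []}.')


def build_payload(instruction: str, model: str = "Imperius/llm-tank") -> dict:
    """Fold the system prompt and the instruction into ONE user turn."""
    user = SYSTEM + "\n\n---\nINSTRUCTION: " + instruction.strip()
    return {
        "model": model,
        "messages": [{"role": "user", "content": user}],
        "temperature": 0,   # assumption: approximates greedy decoding
        "max_tokens": 160,  # assumption: mirrors max_new_tokens in the card
    }


def ask_server(instruction: str, base_url: str = "http://localhost:8000") -> str:
    """POST to the chat completions endpoint and return the raw model text."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(instruction)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a server running, `ask_server("go forward 2 meters")` would return the raw text, which still has to be parsed as JSON with a safe fallback.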

Use Docker

docker model run hf.co/Imperius/llm-tank

SGLang

How to use Imperius/llm-tank with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Imperius/llm-tank" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Imperius/llm-tank",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Imperius/llm-tank" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Imperius/llm-tank",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Imperius/llm-tank with Docker Model Runner:
```
docker model run hf.co/Imperius/llm-tank
```

	---
	license: gemma
	base_model: unsloth/gemma-3-270m-it
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- robotics
	- text-to-json
	- instruction-following
	- mujoco
	- gemma3
	library_name: transformers
	---

	# LLM-Tank — Gemma-3 270M → robot JSON

	Source code: https://codeberg.org/imperius/llm-tank

	Fine-tuned Gemma-3 270M that translates **one free-form English
	instruction** for a tracked robot with a gripper arm into a strict JSON
	command list, executed in a MuJoCo simulation.

	Full pipeline: `text → this model → valid JSON → controller → robot
	drives / grasps`. Code & sim: see the source repository.

	![LLM-Tank demo](demo.gif)

	## What it outputs

	A single JSON object `{"commands": [ ... ]}`. Actions:

	- `move`: `direction` (forward|backward), `distance_m`, `speed?`
	- `turn`: `direction` (left|right), `angle_deg`, `speed?`
	- `stop`, `wait`: `duration_s`
	- `grasp` / `release`: optional `cell` ∈
	`front|front_left|front_right|left|right` (discrete, relative to the
	robot; IK is solved by the controller, not the model)
	- out-of-scope / nonsense → `{"commands": []}`

	The model emits no coordinates — only discrete actions/enums (this
	keeps generation reliable and schema-checkable).
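
Because the output uses only discrete actions and enums, it can be checked with a few lines of code. The sketch below validates an already-parsed object against the action list above. It is not the project's JSON Schema (that lives in the source repository); the function names are hypothetical, and it assumes that `stop` has no required fields, that `speed` is an optional number, and that unknown extra keys are tolerated.

```python
from typing import Any

# Enum values and field names come from the action list above.
# Which fields are required is an assumption made for this sketch.
DIRECTIONS = {"move": ("forward", "backward"), "turn": ("left", "right")}
CELLS = ("front", "front_left", "front_right", "left", "right")
MAGNITUDE_FIELD = {"move": "distance_m", "turn": "angle_deg", "wait": "duration_s"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_command(cmd: Any) -> bool:
    """Return True if one command matches the action list above."""
    if not isinstance(cmd, dict):
        return False
    action = cmd.get("action")
    if not isinstance(action, str):
        return False
    if action in DIRECTIONS:  # move / turn need a matching direction
        if cmd.get("direction") not in DIRECTIONS[action]:
            return False
        if "speed" in cmd and not _is_number(cmd["speed"]):
            return False
    if action in MAGNITUDE_FIELD:  # move / turn / wait need a numeric amount
        return _is_number(cmd.get(MAGNITUDE_FIELD[action]))
    if action == "stop":
        return True
    if action in ("grasp", "release"):  # cell is optional
        return "cell" not in cmd or cmd["cell"] in CELLS
    return False


def is_valid_output(obj: Any) -> bool:
    """Return True for {"commands": [...]} where every command is valid."""
    if not isinstance(obj, dict) or not isinstance(obj.get("commands"), list):
        return False
    return all(is_valid_command(c) for c in obj["commands"])
```

An empty list is valid by construction, which matches the out-of-scope case `{"commands": []}`.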

	## Required input format (IMPORTANT)

	The model was trained with identical training and inference formatting
	(`train == infer`): a **fixed short system prompt** is folded with the
	instruction into ONE user turn. You must use exactly this:

	```python
	import json
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Fixed system prompt from training; do not reword it.
	SYSTEM = ("You translate ONE English instruction for a tracked robot "
	          "with a gripper arm into a single JSON object "
	          '{"commands":[...]} using actions: move, turn, stop, wait, '
	          "grasp, release. Output ONLY the JSON object, no prose, no "
	          'markdown. If the instruction is out of scope or nonsense, '
	          'output {"commands": []}.')

	tok = AutoTokenizer.from_pretrained("PATH_OR_REPO")
	model = AutoModelForCausalLM.from_pretrained("PATH_OR_REPO",
	                                             torch_dtype="auto",
	                                             device_map="auto")

	def translate(instruction: str) -> dict:
	    # System prompt and instruction are folded into ONE user turn.
	    user = SYSTEM + "\n\n---\nINSTRUCTION: " + instruction.strip()
	    enc = tok.apply_chat_template(
	        [{"role": "user", "content": user}],
	        tokenize=True, add_generation_prompt=True,
	        return_dict=True, return_tensors="pt").to(model.device)
	    out = model.generate(**enc, max_new_tokens=160, do_sample=False)
	    # Decode only the newly generated tokens, not the prompt.
	    txt = tok.decode(out[0][enc["input_ids"].shape[1]:],
	                     skip_special_tokens=True)
	    # Keep the span from the first "{" to the last "}".
	    i, j = txt.find("{"), txt.rfind("}")
	    try:
	        return json.loads(txt[i:j + 1])
	    except Exception:
	        return {"commands": []}  # safe fallback

	print(translate("go forward 2 meters then turn left"))
	# {"commands": [{"action": "move", "direction": "forward",
	# "distance_m": 2.0}, {"action": "turn", "direction": "left",
	# "angle_deg": 90}]}
	print(translate("pick it up"))  # {"commands": [{"action": "grasp"}]}
	print(translate("make me a coffee"))  # {"commands": []}
	```

	Use greedy decoding (`do_sample=False`). About 99% of outputs are
	schema-valid without constrained decoding, so always keep the safe
	fallback for the rest.
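
The parsing and fallback step can also be kept separate from the model call, which makes it easy to test without loading weights. The sketch below is a variation on the approach above, not the author's code: it uses `json.JSONDecoder.raw_decode` to parse the first JSON object and ignore anything after it, and it also falls back when the parsed object has no `commands` list. The function name `extract_commands` is hypothetical.

```python
import json


def extract_commands(text: str) -> dict:
    """Parse the first JSON object in `text`; fall back to an empty command list."""
    start = text.find("{")
    if start == -1:
        return {"commands": []}
    try:
        # raw_decode stops at the end of the first complete JSON value,
        # so trailing text after the object does not break parsing.
        obj, _end = json.JSONDecoder().raw_decode(text, start)
    except json.JSONDecodeError:
        return {"commands": []}
    if not isinstance(obj, dict) or not isinstance(obj.get("commands"), list):
        return {"commands": []}
    return obj
```

The trade-off is that this version accepts output with trailing junk, which the first-brace-to-last-brace slice would reject and turn into the fallback.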

	## Metrics (held-out val, 352 examples: locomotion + manipulation + OOD)

	| metric | value |
	| --- | --- |
	| schema_valid_rate | 0.991 |
	| exact_match_rate | 0.943 |
	| action_seq_accuracy | 0.980 |
	| ood_f1 | 0.857 |
	| task_success (MuJoCo, 40) | 0.975 |
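
The card does not define these metrics, so the following is only a guess at how three of them might be computed from their names; the actual evaluation code is in the source repository. The sketch assumes that exact match means equality of the parsed objects, that action-sequence accuracy compares only the ordered `action` values, and that OOD F1 treats an empty command list as the positive class. The function names are hypothetical.

```python
def action_sequence(obj: dict) -> list:
    """Ordered list of action names, ignoring all parameters."""
    return [c.get("action") for c in obj.get("commands", []) if isinstance(c, dict)]


def score(predictions: list, references: list) -> dict:
    """Assumed definitions; inputs are parsed {"commands": [...]} objects."""
    pairs = list(zip(predictions, references))
    n = len(pairs)
    exact = sum(p == r for p, r in pairs)
    seq = sum(action_sequence(p) == action_sequence(r) for p, r in pairs)
    # OOD as binary detection: positive means an empty command list.
    tp = sum(not p["commands"] and not r["commands"] for p, r in pairs)
    fp = sum(not p["commands"] and bool(r["commands"]) for p, r in pairs)
    fn = sum(bool(p["commands"]) and not r["commands"] for p, r in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "exact_match_rate": exact / n if n else 0.0,
        "action_seq_accuracy": seq / n if n else 0.0,
        "ood_f1": f1,
    }
```

Under these definitions, a prediction that failed to parse and was replaced by the fallback counts as an OOD prediction, so schema validity would need to be measured on the raw text before the fallback is applied.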

	## Training

	Full fine-tuning (not LoRA) of `unsloth/gemma-3-270m-it` on ~3.5k
	synthetic instruction→JSON pairs (generated with 120B models, validated
	against a JSON Schema). Training ran in fp32 on a Kaggle T4, in two
	phases: locomotion first, then with arm commands (grasp/release) added.
	Details are in the source repo (`docs/`).

	## Demo

	`demo.mp4` (in this repo) — ~1 min, two panes: left = command + model
	JSON output, right = the robot acting in MuJoCo (real model + real
	physics, not staged).

	## Limitations

	- No perception: the model can't target objects by name/color, only by
	discrete relative `cell`. Object resolution is spatial (controller
	grabs the nearest graspable body in the chosen cell).
	- English only. Single fixed gripper, minimal custom arm.
	- Designed for the accompanying controller/sim; raw JSON is meaningless
	without it.
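
To illustrate the first limitation, here is a rough sketch of what spatial object resolution by `cell` could look like. The geometry is entirely hypothetical: it assumes a robot frame with x pointing forward and y pointing left, cells defined as 45-degree sectors, and object positions already known to the controller. The real controller in the source repository may work differently.

```python
import math
from typing import Optional

# Hypothetical sector centres in degrees (0 = straight ahead, positive = left).
CELL_CENTRES = {
    "front": 0.0,
    "front_left": 45.0,
    "left": 90.0,
    "front_right": -45.0,
    "right": -90.0,
}
HALF_WIDTH_DEG = 22.5  # assumption: each cell spans 45 degrees


def cell_of(x: float, y: float) -> Optional[str]:
    """Map a position in the robot frame to a cell name, or None if behind."""
    bearing = math.degrees(math.atan2(y, x))
    for name, centre in CELL_CENTRES.items():
        if abs(bearing - centre) <= HALF_WIDTH_DEG:
            return name
    return None


def nearest_in_cell(bodies: dict, cell: str) -> Optional[str]:
    """Return the name of the closest body whose position falls in `cell`."""
    candidates = [
        (math.hypot(x, y), name)
        for name, (x, y) in bodies.items()
        if cell_of(x, y) == cell
    ]
    return min(candidates)[1] if candidates else None
```

For example, with `{"cube": (0.5, 0.0), "ball": (0.3, 0.02)}` both bodies fall in `front`, and the ball is chosen because it is closer; the model itself never sees either name.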

	## License

	Weights are a derivative of Google Gemma-3 — use is governed by the
	[Gemma Terms of Use](https://ai.google.dev/gemma/terms). Accompanying
	code is under its own license (see the source repository).