	# OpenVLA-Micro

	Small-vision VLA (vision-language-action model) for CPU robot deployment, trained on LIBERO-90.

	This repo is structured so it can be used as a plain source checkout, a Python package, or a Hugging Face model/code bundle. Large weight files are meant to stay out of GitHub history and be hosted separately when needed.

	| Component | Detail |
	|-----------|--------|
	| Vision | DINOv2-S (384d, 256 patches) + SigLIP-B/16 (768d, 196 patches) |
	| Projector | ShimMLP(384→2048→8704) + ShimMLP(768→2048→8704) → Concat → Linear(8704→896) → GELU → Linear(896→896) |
	| LLM | Qwen2.5-0.5B (896 hidden, 151k vocab, 256 extra tokens) |
	| Action | 7-DoF, discretized into 256 bins per dim, minmax de-normalization |
	| Trainable | 38.1M params (shim MLPs + LoRA rank 8 on projector) |
	| Frozen | DINOv2, SigLIP, Qwen2.5 (all layers + lm_head + embed_tokens) |
	| Training | 5000 steps, batch 64, LR 2e-4 w/ 200-step warmup → cosine to 1e-5 |
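
The learning-rate schedule in the Training row can be sketched in a few lines of plain Python. This is an illustration only: it assumes the warmup is linear from zero and that the cosine decay spans the remaining 4800 steps, neither of which is stated above. The function name `lr_at` and the constants are hypothetical and are not taken from `train_shim.py`.

```python
import math

# Values from the table above.
PEAK_LR = 2e-4
MIN_LR = 1e-5
WARMUP_STEPS = 200
TOTAL_STEPS = 5000


def lr_at(step: int) -> float:
    """Learning rate at a 0-indexed optimizer step (hypothetical helper).

    Assumption: linear warmup to PEAK_LR, then cosine decay to MIN_LR.
    """
    if step < WARMUP_STEPS:
        # Linear warmup; reaches PEAK_LR on the last warmup step.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for step in (0, 199, 200, 2600, 5000):
        print(step, lr_at(step))
```

With these assumptions the rate peaks at 2e-4 when warmup ends and settles at 1e-5 at step 5000; the actual training script may differ in how it counts steps or shapes the warmup.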

	## Inference

	Install as a package:

	```bash
	pip install -e .
	```

	Run the CLI:

	```bash
	openvla-micro --checkpoint openvla-micro-merged.pt --image demo.jpg "pick up the red block"
	```

	Or call the model from Python:

	```python
	from PIL import Image
	from modeling_openvla_micro import OpenVLAMicro

	model = OpenVLAMicro.from_pretrained("openvla-micro-merged.pt", device="cpu")
	model.eval()

	image = Image.open("demo.jpg").convert("RGB")
	action = model.predict_action(image, "pick up the red block")
	print(action) # [dx, dy, dz, droll, dpitch, dyaw, gripper]
	```

	The checkpoint argument can also be a Hugging Face repo ID if that repo contains `openvla-micro-merged.pt` or `openvla-micro-distill.pt`.
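
One way such a checkpoint argument could be resolved is sketched below. This is not the repo's actual loading code: `resolve_checkpoint` is a hypothetical helper, and the order in which the two filenames are tried is an assumption. It relies only on `hf_hub_download` from the `huggingface_hub` package, which is imported lazily so that local paths work without it.

```python
import os

# Filenames that a Hub repo may contain, per the paragraph above.
# The preference order (merged first) is an assumption.
CANDIDATE_FILENAMES = ("openvla-micro-merged.pt", "openvla-micro-distill.pt")


def resolve_checkpoint(checkpoint: str) -> str:
    """Return a local file path for a local path or a Hub repo ID (hypothetical helper)."""
    if os.path.isfile(checkpoint):
        # Already a local checkpoint file; nothing to download.
        return checkpoint

    # Lazy import: only needed when the argument is treated as a repo ID.
    from huggingface_hub import hf_hub_download
    from huggingface_hub.utils import EntryNotFoundError

    for filename in CANDIDATE_FILENAMES:
        try:
            # Downloads into the local Hub cache and returns the cached path.
            return hf_hub_download(repo_id=checkpoint, filename=filename)
        except EntryNotFoundError:
            continue  # This file is not in the repo; try the next one.
    raise FileNotFoundError(
        f"{checkpoint!r} is neither a local file nor a repo containing "
        f"one of {CANDIDATE_FILENAMES}"
    )
```

The returned path could then be passed to `OpenVLAMicro.from_pretrained` in place of the literal filename.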

	## Data

	The checkpoint includes normalization statistics for the following dataset:

	- `libero_90`: 7-DoF end-effector deltas, 256-bin tokenization
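
To make the 256-bin tokenization and min-max de-normalization concrete, here is a minimal sketch of how seven bin indices could be turned back into an action. It assumes uniform bins over [-1, 1] decoded to their centers, followed by a per-dimension min-max rescale. The exact mapping used by `predict_action` is not documented here, and the function names and statistics below are hypothetical placeholders, not values from the checkpoint.

```python
from typing import List, Sequence

NUM_BINS = 256  # bins per action dimension, from the table above


def bin_to_normalized(bin_index: int) -> float:
    """Map a bin index in [0, 255] to the center of its bin in [-1, 1].

    Assumption: bins are uniform over [-1, 1].
    """
    if not 0 <= bin_index < NUM_BINS:
        raise ValueError(f"bin index out of range: {bin_index}")
    width = 2.0 / NUM_BINS
    return -1.0 + (bin_index + 0.5) * width


def decode_action(
    bins: Sequence[int], mins: Sequence[float], maxs: Sequence[float]
) -> List[float]:
    """Min-max de-normalize one bin index per dimension (hypothetical helper)."""
    if not (len(bins) == len(mins) == len(maxs)):
        raise ValueError("bins, mins and maxs must have the same length")
    return [
        lo + 0.5 * (bin_to_normalized(b) + 1.0) * (hi - lo)
        for b, lo, hi in zip(bins, mins, maxs)
    ]


if __name__ == "__main__":
    # Placeholder statistics for illustration only.
    mins = [-1.0] * 7
    maxs = [1.0] * 7
    print(decode_action([0, 64, 128, 192, 255, 128, 255], mins, maxs))
```

Decoding to bin centers keeps the quantization error at most half a bin width per dimension; a scheme that decodes to bin edges would behave slightly differently at the range limits.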

	## Citation

	Built on [openvla-mini](https://github.com/openvla/openvla-mini) and [MiniVLA](https://github.com/rail-berkeley/TinyVLA).

	```bibtex
	@misc{openvla-micro-2026,
	  author = {},
	  title = {OpenVLA-Micro: Small-vision VLA for CPU Robot Deployment},
	  year = {2026},
	}
	```