OpenTouch VT2P Encoder
This repository contains the OpenTouch native retrieval encoder trained for the vt2p task:
- Task: visual + tactile <-> pose
- Model config: OpenTouch-DINOv3-B16-AllModalities
- Sequence length: 20
- Stride: 10
- Training data: OpenTouch official retrieval HF dataset converted locally at datasets/opentouch_official_retrieval_hf
- Initialization: warm-started from LeoJiangOR/opentouch-vp2t-encoder-best/ local VP2Tepoch_280.pt
- Released checkpoint: epoch_300.pt
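
To illustrate how a sequence length of 20 and a stride of 10 interact, here is a minimal sketch of sliding-window enumeration. It assumes that "stride" means the offset between the starts of consecutive windows and that incomplete trailing windows are dropped; the actual OpenTouch data pipeline may differ, and `window_starts` is a hypothetical helper, not part of this repository.

```python
def window_starts(num_frames: int, seq_len: int = 20, stride: int = 10) -> list[int]:
    """Return start indices of all full windows of `seq_len` frames.

    Assumption: consecutive windows start `stride` frames apart and
    a trailing partial window is discarded.
    """
    if num_frames < seq_len:
        return []
    return list(range(0, num_frames - seq_len + 1, stride))


# A 45-frame clip yields three half-overlapping windows:
# frames [0, 20), [10, 30), and [20, 40).
starts = window_starts(45)
print(starts)  # [0, 10, 20]
```

Under these assumptions, adjacent windows share half of their frames, so each frame in the interior of a clip appears in two training sequences.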
The best validation metrics in the run were observed at epoch 295. Checkpoints were saved only every 10 epochs, so no epoch 295 checkpoint is available; the released checkpoint is the final saved one, epoch 300.
Metrics
First full-validation evaluation, epoch 5:
| Direction | R@1 | R@5 | R@10 | mAP |
|---|---|---|---|---|
| visual+tactile -> pose | 0.0097 | 0.0456 | 0.0794 | 0.0375 |
| pose -> visual+tactile | 0.0094 | 0.0429 | 0.0690 | 0.0343 |
Best observed validation, epoch 295:
| Direction | R@1 | R@5 | R@10 | mAP |
|---|---|---|---|---|
| visual+tactile -> pose | 0.0466 | 0.1722 | 0.2553 | 0.1164 |
| pose -> visual+tactile | 0.0476 | 0.1601 | 0.2405 | 0.1131 |
Final saved checkpoint, epoch 300:
| Direction | R@1 | R@5 | R@10 | mAP |
|---|---|---|---|---|
| visual+tactile -> pose | 0.0469 | 0.1648 | 0.2506 | 0.1137 |
| pose -> visual+tactile | 0.0446 | 0.1554 | 0.2439 | 0.1089 |
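
For readers unfamiliar with the columns above, the following sketch shows one common way to compute R@K and mAP from a query-by-gallery similarity matrix. It assumes each query has exactly one correct match (the item at the same index), in which case average precision reduces to the reciprocal of the match's rank. The evaluation code used for this run is not shown here and may handle ties or multiple positives differently; `retrieval_metrics` is a hypothetical name.

```python
def retrieval_metrics(sim, ks=(1, 5, 10)):
    """Compute Recall@K and mAP for one retrieval direction.

    sim[i][j] is the similarity between query i and gallery item j.
    Assumption: the only correct match for query i is gallery item i.
    """
    n = len(sim)
    ranks = []
    for i, row in enumerate(sim):
        # 1-based rank of the true match: one plus the number of items
        # that score strictly higher (ties are resolved optimistically).
        ranks.append(1 + sum(1 for s in row if s > row[i]))

    metrics = {f"R@{k}": sum(r <= k for r in ranks) / n for k in ks}
    # With a single positive per query, AP = 1 / rank.
    metrics["mAP"] = sum(1.0 / r for r in ranks) / n
    return metrics


# Toy example: queries 0 and 2 rank their match first, query 1 ranks it second.
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.8, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]
toy_metrics = retrieval_metrics(toy_sim)
print(toy_metrics)
```

Swapping the roles of the two embedding sets (transposing the matrix) gives the opposite retrieval direction, which is why each table reports two rows.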
Files
- epoch_300.pt: released final checkpoint
- config/OpenTouch-DINOv3-B16-AllModalities.json: model config
- results/results.jsonl: full validation history
- params.txt: training hyperparameters
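
As a rough sketch of how the validation history could be inspected, the snippet below picks the record with the highest value of a chosen metric from JSON Lines input. The field names `epoch` and `vt2p_mAP` are hypothetical placeholders; the actual keys in results/results.jsonl are not documented here and should be checked before use. The demo values are the visual+tactile -> pose mAP figures from the tables above.

```python
import json


def best_record(lines, metric_key):
    """Return the JSON Lines record with the largest `metric_key`, or None.

    Blank lines and records that lack the metric are skipped.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    scored = [r for r in records if metric_key in r]
    return max(scored, key=lambda r: r[metric_key]) if scored else None


# Demo input with hypothetical key names (not the file's confirmed schema).
demo_lines = [
    '{"epoch": 5, "vt2p_mAP": 0.0375}',
    '{"epoch": 295, "vt2p_mAP": 0.1164}',
    '{"epoch": 300, "vt2p_mAP": 0.1137}',
]
best = best_record(demo_lines, "vt2p_mAP")
print(best["epoch"])  # 295
```

With the real file, an open file handle can be passed in place of `demo_lines`, since iterating over a text file yields one line at a time.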