Octopus

Tuwhy's Collections

updated Feb 9

RL checkpoints of Octopus-8B and the baselines from the paper "Learning Self-Correction in Vision–Language Models via Rollout Augmentation".