Add Colab notebook; update the train and Docker files; align the requirements with the codebase 2b9c8cb Bhaskar committed 23 days ago
Round 2 Upgrade: Added GRPO train.py and vector-field reward shaping 51bb0d4 Bhaskar committed 26 days ago