remove duplicate video links from README links table and blog footer (kept the top hero only) 8408da1 Jeeevan11 committed on Apr 26
ship submission: judge-aligned README, real training curves from 2000-step run, blog, scripts ce00c50 Jeeevan11 committed on Apr 26
optimize for hackathon time budget: 256 tokens, 200-step checkpoints bd2780e Jeeevan11 committed on Apr 25
normalize agent C decision before sending to env (defaults + clamps) 2159f38 Jeeevan11 committed on Apr 25
upgrade torch to 2.5+ for unsloth_zoo compatibility; use ampere-torch250 wheel e1ae66a Jeeevan11 committed on Apr 25
import unsloth first; pin torch 2.4.0 for unsloth cpp ext compat 569dcfa Jeeevan11 committed on Apr 25
broaden TRL import catch; pin compatible transformers/trl/peft versions 69a91cc Jeeevan11 committed on Apr 25
expand dataset to 40 templates / 8 domains; fix injection edge cases; lazy unsloth import b532a41 Jeeevan11 committed on Apr 25
expand dataset: 40 templates, 8 domains, fixed injection edge cases 4a18ff6 Jeeevan11 committed on Apr 25
Merge branch 'main' of https://huggingface.co/spaces/testingaccc/conflict-arbitration-env c1fd4ec Jeeevan11 committed on Apr 25