Commit History

Update train_grpo1.py: set NUM_STEPS=50 and optimize logging steps
43be96e

ayhm23 committed on

Fix: Export JSON and IPYNB from LFS to regular Git storage
7f12fb7

ayhm23 committed on

Configure LFS for binary files and update Dockerfile
2741aa5

ayhm23 committed on

Enhance: Update GRPO training process with improved plotting and sentinel integration for 50-step runs
825ae74

ayhm23 committed on

Fix: 50-step GRPO training with plotting & sentinel
8dba71d

ayhm23 committed on

fix: stabilize 5-step smoke logging and completion
22606ee

ayhm23 committed on

chore: keep smoke run at 5 steps
ea079ff

ayhm23 committed on

fix: harden smoke-run plotting for HF space
4a808b4

ayhm23 committed on

chore: reduce smoke run to 5 steps
b49b9bf

ayhm23 committed on

fix: make max_steps the sole GRPO stop condition
1ec59fb

ayhm23 committed on

chore: switch to 50-step smoke test run
dc95f89

ayhm23 committed on

fix: harden HF cache paths and add Space push helper
0ae3dba

ayhm23 committed on

modified ipynb train code
dcd3fb7

Puskara committed on

added sft scenarios
cc0e7f6

Puskara committed on

Update README.md
fc24c2e

Sanyam Verma committed on

fix: add ForceStopCallback to ensure training ends at step 300
d87e253

ayhm23 committed on

chore: finalize environment and training configuration
88c3088

ayhm23 committed on

feat: add step tracking and stop-check safeguards for Phase 4
67ca245

ayhm23 committed on

modified context
d18a871

Puskara committed on

fix: correct accelerate version to 1.13.0
b348c02

ayhm23 committed on

fix: downgrade TRL to 0.14.0 as requested
8b298f4

ayhm23 committed on

fix: upgrade to TRL 0.14.3 and restore top-level imports
fcd8b86

ayhm23 committed on

fix: use absolute python path and stable pytorch base image
305520e

ayhm23 committed on

fix: harden Docker build with python3 -m pip and import validation
f197867

ayhm23 committed on

fix: force trl upgrade and use direct sub-module imports
3219978

ayhm23 committed on

fix: use python3 in Dockerfile CMD for HF compatibility
cc850c1

ayhm23 committed on

fix: revert Dockerfile to latest and pin transformers to 4.46.0
230899c

ayhm23 committed on

chore: pin HF base image to 4.47.1 for total environment parity
6118bc1

ayhm23 committed on

fix: upgrade transformers to 4.47.1 to satisfy TRL 0.12.1 requirements
07a34b0

ayhm23 committed on

fix: upgrade accelerate and transformers to satisfy TRL 0.12.1
62e4067

ayhm23 committed on

fix: use HF base image and TRL 0.12.1 for GRPOConfig support
2a2707c

ayhm23 committed on

fix: revert to even more stable library versions and add wandb/matplotlib
001c98e

ayhm23 committed on

fix: pin trl to 0.11.4 and transformers to 4.45.2 to resolve FSDP error
b29b7db

ayhm23 committed on

fix: remove torch/transformers from requirements to avoid Docker conflict
46f600c

ayhm23 committed on

fix: upgrade pytorch base image to 2.5.1 for TRL compatibility
36ba6cb

ayhm23 committed on

chore: set Dockerfile to training mode for HF Space deployment
42840cb

ayhm23 committed on

feat: prepare for HF Docker training with Hub integration
c289333

ayhm23 committed on

chore: update .gitignore to exclude results directory
e4e3226

ayhm23 committed on

feat: optimize GRPO training with stability guards and reward curve plotting
48ced0d

ayhm23 committed on

added legitimate claim reward
92e7f5c

Puskara committed on

fix: clean up deps for HF Docker build - server-only pyproject, split requirements
7eb8978

sanyamvermaa committed on

fix: add missing requirements-server.txt for Docker build
9534763

sanyamvermaa committed on

added ipynb file for training grpo
08a62a0

Puskara committed on

feat: implement generalization testing pipeline and report generation for held-out evaluation scenarios
ae22694

ayhm23 committed on

docs: refactor context.md for project clarity and add baseline/phase 3 documentation
6a045c8

ayhm23 committed on

Refactor: Optimized for HF deployment, improved reward stability, and cleaned up workspace.
4169567

ayhm23 committed on

added multiline tier 1 scenarios
0ab9228

Puskara committed on

Fix 0.52 plateau, implement multi-turn eval, and prepare HF deployment
2ad21f7

ayhm23 committed on

fixed verifier bugs
ce2278b

Puskara committed on

chore: remove unused configuration file
60eb0d6

ayhm23 committed on