Add Implementation details section and surface YouTube demo URL up top in BLOG 75616fd Krishna1107 committed on Apr 26
Drop GRPO temp to 0.3, bump max_new_tokens to 2048, add inference smoke test 576dfc3 Krishna1107 Claude Opus 4.7 committed on Apr 26
Add YouTube demo video link to README and BLOG d681074 Krishna1107 Claude Opus 4.7 committed on Apr 26
Reset env between retry attempts within a single episode e67a63a Krishna1107 Claude Opus 4.7 committed on Apr 26
Add inference-time retry loop with feedback for self-correction 10feced Krishna1107 Claude Opus 4.7 committed on Apr 26
Document demonstration experiment finding; finalize README links 1633162 Krishna1107 Claude Opus 4.7 committed on Apr 26
Fix verification gaps: add README links, rename blog, fix Colab badge, set author e095963 Krishna1107 Claude Opus 4.7 committed on Apr 26
Document Phase 1 results and Phase 2 self-play roadmap eb6bdcf Krishna1107 Claude Opus 4.7 committed on Apr 26
Lower GRPO rollout temperature to 0.7 for more deterministic test code generation 939d0ba Krishna1107 Claude Opus 4.7 committed on Apr 26
Fix GRPO reward routing: correct seed lookup + markdown fence stripping c923ecc Krishna1107 Claude Opus 4.7 committed on Apr 26
Fix make_plots.py to accept --wandb-run-id separately from --training-log-json 640263f Krishna1107 Claude Opus 4.7 committed on Apr 26
Retry torch.cuda.is_available() in fresh python procs; NVML ready != cuInit ready 27e2f37 Krishna1107 committed on Apr 26
Rewrite run_hf_job_7b.sh end-to-end: nvidia-smi GPU poll, clean phase structure 8e61ac4 Krishna1107 committed on Apr 26
Drop venv approach; force-reinstall torch stack over base image to actually upgrade past 2.5.1 1110d0a Krishna1107 committed on Apr 26
Phase 0: build fresh venv + restore CUDA poll + restore upfront huggingface_hub install c4df898 Krishna1107 committed on Apr 26
Install torchaudio alongside torch/torchvision to clear pinned-base-image conflict 7ab4e76 Krishna1107 committed on Apr 26
Drop version pins on training extras and torch/torchvision install 241a3bd Krishna1107 committed on Apr 26
Hard reset HF Job env: install torch+torchvision together at fixed versions bd78955 Krishna1107 committed on Apr 26
Upgrade torchvision/torchaudio alongside torch to fix nms op registration 94a6b52 Krishna1107 committed on Apr 26
Upgrade torch to 2.5+ in Phase 0; pin trl>=0.14 for GRPOTrainer ce6481e Krishna1107 committed on Apr 26
Fix sys.path so train_grpo.py imports training.prompts correctly when run as script 4f5fbc5 Krishna1107 Claude Opus 4.7 committed on Apr 26
Make 7B HF Job Phase 0 resilient: poll for CUDA readiness, install hf_hub up front for trap 27a9b93 Krishna1107 Claude Opus 4.7 committed on Apr 26
Use --env instead of --secrets KEY=VALUE for WANDB_API_KEY 571af06 Krishna1107 Claude Opus 4.7 committed on Apr 26
Fix wandb login in 7B HF Job: verify env var present, use CLI not Python API cb93805 Krishna1107 Claude Opus 4.7 committed on Apr 26
Add 7B variant of HF Job for stronger cold-start. f249cf5 Krishna1107 Claude Opus 4.7 committed on Apr 26
Add reduced-scope HF Job variant for time-constrained scenarios 51e8315 Krishna1107 Claude Opus 4.7 committed on Apr 26
Prompt fix: include full module source + grounding rule + corpus example; skip baseline recompute when cached c26549f Krishna1107 Claude Opus 4.7 committed on Apr 26
Switch HF Job base image to pytorch:2.5.1-cuda12.4 to fix CUDA13 ABI bleed 12ecfb5 Krishna1107 Claude Opus 4.7 committed on Apr 25
Fix torch/torchvision ABI mismatch in HF Job Phase 0 efb52e5 Krishna1107 Claude Opus 4.7 committed on Apr 25
Fix GitHub username in HF Job scripts: jester1177 -> melohub-xbit 082986e Krishna1107 Claude Opus 4.7 committed on Apr 25
Fix submit_hf_job.sh syntax: image as positional arg 900c737 Krishna1107 Claude Opus 4.7 committed on Apr 25
Pre-submission HF Job patches: Python version compat + skip torch reinstall b5d30d0 Krishna1107 Claude Opus 4.7 committed on Apr 25
Add HF Job training pipeline: persistence-aware run script, judge-facing demo notebook, baseline JSON output e01ee6d Krishna1107 Claude Opus 4.7 committed on Apr 25
Re-apply training prep work on top of validated Layers 1-5 base 1b83cd4 Krishna1107 committed on Apr 25
Initial commit: MutantHunter — RL env for mutation-score-rewarded test generation 91487c9 Krishna1107 committed on Apr 25