Commit History

Add Implementation details section and surface YouTube demo URL up top in BLOG
75616fd

Krishna1107 committed on

Drop GRPO temp to 0.3, bump max_new_tokens to 2048, add inference smoke test
576dfc3

Krishna1107 Claude Opus 4.7 committed on

Add YouTube demo video link to README and BLOG
d681074

Krishna1107 Claude Opus 4.7 committed on

Reset env between retry attempts within a single episode
e67a63a

Krishna1107 Claude Opus 4.7 committed on

Add inference-time retry loop with feedback for self-correction
10feced

Krishna1107 Claude Opus 4.7 committed on

Document demonstration experiment finding; finalize README links
1633162

Krishna1107 Claude Opus 4.7 committed on

Fix verification gaps: add README links, rename blog, fix Colab badge, set author
e095963

Krishna1107 Claude Opus 4.7 committed on

Add in-context demonstration learning support
968797f

Krishna1107 Claude Opus 4.7 committed on

Document Phase 1 results and Phase 2 self-play roadmap
eb6bdcf

Krishna1107 Claude Opus 4.7 committed on

Lower GRPO rollout temperature to 0.7 for more deterministic test code generation
939d0ba

Krishna1107 Claude Opus 4.7 committed on

Fix GRPO reward routing: correct seed lookup + markdown fence stripping
c923ecc

Krishna1107 Claude Opus 4.7 committed on

Fix make_plots.py to accept --wandb-run-id separately from --training-log-json
640263f

Krishna1107 Claude Opus 4.7 committed on

Retry torch.cuda.is_available() in fresh python procs; NVML ready != cuInit ready
27e2f37

Krishna1107 committed on

Rewrite run_hf_job_7b.sh end-to-end: nvidia-smi GPU poll, clean phase structure
8e61ac4

Krishna1107 committed on

Drop venv approach; force-reinstall torch stack over base image to actually upgrade past 2.5.1
1110d0a

Krishna1107 committed on

Phase 0: build fresh venv + restore CUDA poll + restore upfront huggingface_hub install
c4df898

Krishna1107 committed on

Install torchaudio alongside torch/torchvision to clear pinned-base-image conflict
7ab4e76

Krishna1107 committed on

Drop version pins on training extras and torch/torchvision install
241a3bd

Krishna1107 committed on

Hard reset HF Job env: install torch+torchvision together at fixed versions
bd78955

Krishna1107 committed on

Upgrade torchvision/torchaudio alongside torch to fix nms op registration
94a6b52

Krishna1107 committed on

Upgrade torch to 2.5+ in Phase 0; pin trl>=0.14 for GRPOTrainer
ce6481e

Krishna1107 committed on

Fix sys.path so train_grpo.py imports training.prompts correctly when run as script
4f5fbc5

Krishna1107 Claude Opus 4.7 committed on

Make 7B HF Job Phase 0 resilient: poll for CUDA readiness, install hf_hub up front for trap
27a9b93

Krishna1107 Claude Opus 4.7 committed on

Use --env instead of --secrets KEY=VALUE for WANDB_API_KEY
571af06

Krishna1107 Claude Opus 4.7 committed on

Fix wandb login in 7B HF Job: verify env var present, use CLI not Python API
cb93805

Krishna1107 Claude Opus 4.7 committed on

Add 7B variant of HF Job for stronger cold-start.
f249cf5

Krishna1107 Claude Opus 4.7 committed on

Add reduced-scope HF Job variant for time-constrained scenarios
51e8315

Krishna1107 Claude Opus 4.7 committed on

Prompt fix: include full module source + grounding rule + corpus example; skip baseline recompute when cached
c26549f

Krishna1107 Claude Opus 4.7 committed on

Bind env server to port 7860 for HF Spaces
29e10c9

Krishna1107 Claude Opus 4.7 committed on

Switch HF Job base image to pytorch:2.5.1-cuda12.4 to fix CUDA13 ABI bleed
12ecfb5

Krishna1107 Claude Opus 4.7 committed on

Fix torch/torchvision ABI mismatch in HF Job Phase 0
efb52e5

Krishna1107 Claude Opus 4.7 committed on

Fix GitHub username in HF Job scripts: jester1177 -> melohub-xbit
082986e

Krishna1107 Claude Opus 4.7 committed on

Update README.md
98aaaf0
verified

jester1177 committed on

Fix submit_hf_job.sh syntax: image as positional arg
900c737

Krishna1107 Claude Opus 4.7 committed on

Pre-submission HF Job patches: Python version compat + skip torch reinstall
b5d30d0

Krishna1107 Claude Opus 4.7 committed on

Add HF Job training pipeline: persistence-aware run script, judge-facing demo notebook, baseline JSON output
e01ee6d

Krishna1107 Claude Opus 4.7 committed on

Re-apply training prep work on top of validated Layers 1-5 base
1b83cd4

Krishna1107 committed on

Initial commit: MutantHunter — RL env for mutation-score-rewarded test generation
91487c9

Krishna1107 committed on