Spaces: Pandaisop/codesensei-env

Commit History

fix: resolve 500 error on /schema and add extra validation tasks

52fe477

vineetshukla.work@gmail.com committed on Apr 12

feat: implement diverse autonomous grader for real-world code debugging tasks

d09b739

vineetshukla.work@gmail.com committed on Apr 12

fix: standardize grader structure to tasks/ directory for OpenEnv validation compliance

a594b6e

vineetshukla.work@gmail.com committed on Apr 12

fix: add mandatory /metadata and /schema endpoints for OpenEnv validation and update task metadata

5a87b28

vineetshukla.work@gmail.com committed on Apr 12

Fix OpenEnv task mapping schema: Add required 'id' fields

db07239

vineetshukla.work@gmail.com committed on Apr 12

Fix OpenEnv validation: Use explicit python grader for tasks

9bd4a93

vineetshukla.work@gmail.com committed on Apr 12

revert(training): temporarily revert dataset bounds back to 10 for complete trial confirmation before scaling

6fd478f

vineetshukla.work@gmail.com committed on Apr 12

feat(training): scale dataset to 500 prompts and add hub export cell for full hackathon training run

8b03007

vineetshukla.work@gmail.com committed on Apr 12

fix(trl): inject warnings_issued dict into unsloth model to prevent TRL GRPO init crash on PeftModels

0413c85

vineetshukla.work@gmail.com committed on Apr 12

fix(trl): force hardware-specific fp16/bf16 flags to prevent TRL from crashing on T4 GPU

6262683

vineetshukla.work@gmail.com committed on Apr 12

fix(trl): mock weave.trace to complete package bypass

6f6568f

vineetshukla.work@gmail.com committed on Apr 12

fix(trl): add weave to the magic mock bypass list for cell 5

7ea8558

vineetshukla.work@gmail.com committed on Apr 12

fix(trl): inject magicmock bypass into cell 5 to squash TRL 0.24.0 import crash

e0923c4

vineetshukla.work@gmail.com committed on Apr 12

feat: complete rewrite of grpo training script using Unsloth optimized pathway

aa86916

vineetshukla.work@gmail.com committed on Apr 12

chore: test trial run for colab script with 10 dataset size and fixed dependencies

a59d79f

vineetshukla.work@gmail.com committed on Apr 12

fix: downgrade TRL to 0.15.0 and nuke ghost vLLM install to prevent import crashes

b119c82

vineetshukla.work@gmail.com committed on Apr 11

fix: correct trl version pin to 0.17.0

a46eca2

vineetshukla.work@gmail.com committed on Apr 11

fix: vLLM/TRL version conflict in Colab Cell 1

0b9cc90

vineetshukla.work@gmail.com committed on Apr 11

fix: training reward bounds to (0.01, 0.99) — prevents loss=0 from zero variance

545acf0

vineetshukla.work@gmail.com committed on Apr 11

fix: task routing by name, remove out-of-range rewards, add grader field to tasks

b64950c

vineetshukla.work@gmail.com committed on Apr 11

fix: strictly bound task scores to satisfy OpenEnv Phase 2 testing bounds

5fcb94c

vineetshukla.work@gmail.com committed on Apr 8

fix: allow empty body for /reset endpoint to pass auto-grader

46525cf

vineetshukla.work@gmail.com committed on Apr 8

feat: switch default API to Groq (llama-3.3-70b-versatile) for faster inference

a508ad3

vineetshukla.work@gmail.com committed on Apr 8

feat: Introduce environment models, a restricted code sandbox, and a test runner for secure LLM code evaluation.

14d1002

vineetshukla.work@gmail.com committed on Apr 8

docs: rewrite README, clean up repo structure

f3f5cb0

vineetshukla.work@gmail.com committed on Apr 8

fix: use Qwen2.5-Coder-32B-Instruct (available on HF router) instead of Qwen3-1.7B

1caebb9

vineetshukla.work@gmail.com committed on Apr 8

feat: add pre-submission compliance checker

181b78a

vineetshukla.work@gmail.com committed on Apr 8

feat: add pyproject.toml, server entry point, uv.lock for openenv validate compliance

1adae3f

vineetshukla.work@gmail.com committed on Apr 8

fix: rewrite inference.py & client.py to match OpenEnv sample pattern

6cbd2ef

vineetshukla.work@gmail.com committed on Apr 8

feat: add inference.py, openenv.yaml, bug dataset & openai dep for hackathon submission

01620c1

vineetshukla.work@gmail.com committed on Apr 8

fix: increase num_generations=6, temperature=0.9, lr=2e-5 to fix zero-loss GRPO

b3bf487

vineetshukla.work@gmail.com committed on Mar 28

fix: remove fp16=True, use paged_adamw_8bit for QLoRA BFloat16 compatibility

038f1b4

vineetshukla.work@gmail.com committed on Mar 28

fix: add 4-bit quantization + LoRA to fit GRPO training in T4 15GB VRAM

90d6d56

vineetshukla.work@gmail.com committed on Mar 28

refactor: rewrite training script - remove rollout_func, use inline reward evaluation

d81cba7

vineetshukla.work@gmail.com committed on Mar 27

fix: remove deprecated GRPOConfig params for latest TRL compatibility

0589e5e

vineetshukla.work@gmail.com committed on Mar 27

fix: set live HF Space URL in training script and demo app

d12dd6e

vineetshukla.work@gmail.com committed on Mar 27

fix: add HF Spaces YAML frontmatter to README.md

64c9646

vineetshukla.work@gmail.com committed on Mar 27

chore: add .gitignore, remove __pycache__ from tracking

ad41893

vineetshukla.work@gmail.com committed on Mar 27

feat: CodeSensei - GRPO-trained LLM code debugger on OpenEnv

c47c81c

vineetshukla.work@gmail.com committed on Mar 27
