Commit History

Hackathon submission: new README (3-5 min read), BLOG.md narrative, frontier baselines, design-principles framing
40de84e
verified

yashash045 committed on

Phase J.7: add /curriculum_progress endpoint + GRPO polling for mastery telemetry
1f942fb

yashash04 committed on

Phase D: pipeline_environment.py surgery — cut adversarial designer, handoff tracker, handoff/specialization rewards
ca596ec

yashash04 committed on

Phase C: cut handoff_quality_reward + role_specialization_bonus, lower STEP_REWARD_MAX to 0.32
bbc878a

yashash04 committed on

Phase A: cut adversarial_designer, ollama_client, judge_client, handoff_metrics
df13833

yashash04 committed on

Phase 6.5 fix (4/5): reward magnitude tune-up (terminal bonuses/penalties)
f688087

yashash04 committed on

Phase 6.5 fix (3/5): Groq LLM judge client with Ollama fallback
04815c6

yashash04 committed on

Pre-Phase 6: retry on Ollama client + mark live tests requires_ollama
13e0416

yashash04 committed on

Phase 5.7: wire adversarial scenarios into engine state
61783d3

yashash04 committed on

Phase 5.6: curriculum tracking
8cd55d8

yashash04 committed on

Phase 5.5: reward additions
12502fc

yashash04 committed on

Phase 5.4: handoff scoring + role routing
155fda0

yashash04 committed on

Phase 5.3: step validation
6a48747

yashash04 committed on

Phase 5.2: update reset
d84d4b7

yashash04 committed on

Phase 5.1: wire init
4eb4d05

yashash04 committed on

Phase 4: handoff metrics
962efb9

yashash04 committed on

Phase 3 cleanup: enforce step budget in designer._parse
dc6970a

yashash04 committed on

Phase 3: ollama client + adversarial designer
a87e602

yashash04 committed on

Phase 2: curriculum controller
3736d30

yashash04 committed on

Phase 1: role system
305410b

yashash04 committed on

Round 2: rename Python package to devops_pipeline_gym
1f80eda

yashash04 committed on

Fix: all score paths return strictly (0,1) — never 0.0 or 1.0
4681517

yashash04 committed on

Fix: clamp grader scores to (0.001, 0.999) — strict 0<score<1
a651167

yashash04 committed on

Fix health visibility leak + recovery alert bug
4c913de

yashash04 committed on

Fix grader weights, harden int() casts, fix partial obs leak, add difficulty + exploit docs
54bdcbb

yashash04 committed on

Harden random_incident grader, fix /grader default, remove prescriptive logs from hard tasks, add recovery status to obs, stochastic docs
40168c6

yashash04 committed on

Final: config recovery delay, expand proc gen (5 types + compound), partial obs fix, reward cap +0.30, MDP docs, seed curriculum
512fb6e

yashash04 committed on

Fix _failing_service bug, add shared_buffers hint, fix partial obs leak, add MDP description
1e96d44

yashash04 committed on

Final fixes: deploy-time grader check, anti-spam penalty, public attrs, seed at reset, remove openai from reqs, baseline recalibration
decc7bb

yashash04 committed on

Round 2 judge fixes: reward pipeline, sub-goals, exploration decay, grader depth
5af7f3e

yashash04 committed on

Fix all 3-judge review findings: rewards, graders, engine ordering, spec compliance
a13085e

yashash04 committed on

Add procedural scenario generation: random_incident task (Task 6)
470dbc1

yashash04 committed on

Skip compounding/tipping for clean_deploy, remove dead import, fix health endpoint
e200fa5

yashash04 committed on

Make clean_deploy truly easy: skip transient staging failure, reduce deploy spikes
a6bdc1f

yashash04 committed on

Fix task selection: accept task in reset() body, not just env var
8e28062

yashash04 committed on

Make staging→prod flow obvious, differentiate baseline scores
b28fab6

yashash04 committed on

Fix capacity_crisis grader: optimal now beats random, recalibrate scoring
803c960

yashash04 committed on

Add observation summary field, rewrite inference system prompt, sorted JSON keys
ea8b901

yashash04 committed on

Add capacity_crisis task (5th task) — prevent collapse under 4x traffic
497414b

yashash04 committed on

Add non-linear tipping points for emergent behavior
e7011f3

yashash04 committed on

Add auth-service to all 4 scenarios with updated dependency graph
d488315

yashash04 committed on

Add database-primary service to all 4 scenarios with dependency graph, fix graders to count only target services
89a7e34

yashash04 committed on

Reward shaping audit: add repeat-action penalty, reward bounds [-0.35, +0.20], repeated investigation penalty
e224ee7

yashash04 committed on

Fix 5 evaluator issues: outcome-based grader, reduced warmup spikes, app.state for grader, remove openai from server deps, tune time pressure
4638f7c

yashash04 committed on

Add trade-off effects, cross-metric compounding, recovery cascade, 3-path grader, non-linear deploys
0e53462

yashash04 committed on

DevOps Pipeline OpenEnv Environment - Full submission
e96e39f

yashash04 committed on