fix(inference): refine scaling and rerouting rules and action behavior d062bfb div18 committed 19 days ago
refactor(simulator): add feedback on action effects and enforce capacity discipline f55f75f div18 committed 19 days ago
feat: implement Kubernetes executor for automated cluster scaling and infrastructure management cf2697b div18 committed 21 days ago
feat(curriculum): add progressive training curriculum management 52a986a div18 committed 24 days ago
feat: implement core SRE simulation environment, Pydantic schemas, and physics models for task-based cluster management 5144b7e div18 committed on Apr 1
update cost model and observation descriptions for clarity and accuracy f656047 Keshav051 committed on Mar 31
added dropout, probability, normalization, and limits to values, making the environment more challenging and balanced bba6f8a PranavKK1201 committed on Mar 30
integrate production simulator, stability math, and reward recalibration 654c8c7 PranavKK1201 committed on Mar 27