Pranavkk
/

AntiAtropos

openenv

Model card Files Files and versions

xet

Community

PranavKK1201 commited on 28 days ago

Commit

0a957b8

1 Parent(s): 0c74ebd

agent.md and claude.md

Browse files

Files changed (2) hide show

AGENTS.md +56 -0
CLAUDE.md +1 -0

AGENTS.md ADDED Viewed

	@@ -0,0 +1,56 @@

+# AntiAtropos: The Physics of Autonomous SRE
+> **"Infrastructure is not a static set of configurations; it is a dynamic system of energy, flow, and stability."**
+## The Vision
+AntiAtropos is a next-generation **Autonomous SRE (Site Reliability Engineering) Control Environment**. While traditional DevOps relies on static thresholds (e.g., "if CPU > 80%"), AntiAtropos treats a microservice cluster as a **Physics Engine**.
+Our vision is to move from reactive scripts to **Dynamical System Control**. We are building an environment where AI agents don't just "fix things"—they balance the "Potential Energy" of a cluster to maintain equilibrium under extreme pressure.
+---
+## 1. The Physics Engine Concept
+Traditional observability measures metrics; we measure **Stability**. We have modeled our 10-node cluster using **Fluid Queue Dynamics**, treating request flow like water and nodes like reservoirs.
+### The Lyapunov Potential ($V$)
+The "North Star" of our environment is the **Lyapunov Energy Function**:
+$$V(s) = \sum_{i=1}^{N} w_i \cdot Q_i^2$$
+*   **$Q_i$ (Queue Depth):** The "Potential Energy" or mass accumulated in a service.
+*   **$w_i$ (Weight):** The "Gravity" or business importance (node-0 is the VIP Payment Gateway).
+*   **Cascading Failures:** Our physics engine models "Backlog Pressure," where one failing node can trigger a chain reaction across its neighbors.
+### Advanced Latency Dynamics (M/M/1)
+We move beyond linear latency models. AntiAtropos implements a **"Hockey-Stick" Latency Curve**. As utilization approaches 100%, latency increases exponentially—modeling the "Point of No Return" that real-world on-call engineers fear.
+---
+## 2. Training Strategy: The Professional Loop
+To build a hackathon-winning agent, we use a complex training pipeline coordinated between **Google Colab** and **Hugging Face**:
+### Progressive Curriculum Learning
+Agents are not trained at random. They follow a **Curriculum** (`curriculum.py`) that graduates them through increasingly difficult stages:
+1.  **Stage 1-3:** Capacity Ramping (Learning to scale).
+2.  **Stage 4-5:** Fault Tolerance (Learning to reroute).
+3.  **Stage 6-8:** Surge Stability (Learning to balance competing pressures).
+4.  **Finals:** Sustained protection under cascading failure conditions.
+### Episodic Replay Buffer
+Using `replay.py`, our agents maintain a "Long-term Memory" of **Key Transitions**. Instead of relearning from scratch, the model uses **Few-Shot Demonstrations** to see how successful previous strategies were executed.
+---
+## 3. Upcoming & Unconfirmed Roadmap
+> [!IMPORTANT]
+> **DISCLAIMER:** The following features are in the research phase and are NOT yet finalized or confirmed. Please consult with the core team before assuming implementation details.
+*   **Multi-Token Attention for SRE:** Investigating the use of frequency-selective transformation to capture "cluster breathiness" (p99 jitter) rather than just global averages.
+*   **Graph Neural Network (GNN) Control:** Potential pivot toward modeling the cluster as a dynamic graph to directly manage the "topology of stress."
+*   **Cross-Cluster Generalization:** Testing models trained on 10 nodes against 20 and 50 node environments.
+---
+## Why This Wins
+AntiAtropos doesn't follow runbooks. It understands the **laws of motion** within a cluster. By training agents to minimize "System Energy," we create infrastructure that is inherently self-healing, cost-efficient, and mathematically stable.
+---
+*Created for the 2026 AntiAtropos Hackathon.*

CLAUDE.md ADDED Viewed

	@@ -0,0 +1 @@


1	+ Refer to AGENT.md for instructions