Spaces:

armaan020
/

AegisOpenEnv

Sleeping

App Files Files Community

AegisOpenEnv / walkthrough.md

armaan020

Upload folder using huggingface_hub

121fad1 verified 25 days ago

preview code

raw

history blame contribute delete

2.41 kB

AegisOpenEnv: Final Compliance & Training Walkthrough

AegisOpenEnv is now 100% compliant with the Meta OpenEnv competition specifications. We've refined the core logic, implemented a CPU-friendly RL training loop, and standardized the inference layer.

🚀 Key Improvements

1. Modular Grader Refinement

Moved hardcoded scoring logic from server.py to a dedicated Grader class in grader.py.

Centralized Rewards: All tiers (Easy/Medium/Hard) are now scored by grader.grade().
Penalties: False Positive (-1.0) and False Negative (-5.0) penalties are strictly enforced.

2. CPU-Optimized RL Training

Implemented train_cpu.py using the REINFORCE algorithm.

Goal: Allow training on systems without high-end GPUs.
Dataset: Integrated SecureFinAI-Lab/Regulations_QA for real-world regulatory prompts.
Logic: Manual policy gradient backprop ($Loss = -log_prob \times reward$).

3. Standardized Inference

Consolidated all inference logic into a single, official inference.py.

Compliance: Uses openai.OpenAI() client exclusively.
Reporting: Automatically generates a Reproducibility Report with Mean Score and Compliance Status.
Cleanup: Removed legacy client.py and baseline_inference.py.

✅ Validation Results

The environment passed the official openenv validate suite with 100% SUCCESS.

{
  "target": "https://armaan020-aegisopenenv.hf.space",
  "passed": true,
  "required": true,
  "expected": {
    "/reset": true,
    "/step": true,
    "/state": true
  },
  "actual": {
    "status_code": 200,
    "/reset": true,
    "/step": true,
    "/state": true
  }
}

🛠️ How to Deploy

To finalize the submission to Hugging Face Spaces:

Run the deployment script:

python deploy_hf.py armaan020/AegisOpenEnv --public

The environment expects the model Qwen/Qwen2.5-0.5B-Instruct by default for CPU training. You can override this using the MODEL_NAME environment variable.

Created for the Meta OpenEnv Prize Pool. Part of the Aegis compliance suite.