Spaces: armaan020/AegisOpenEnv

armaan020 committed 23 days ago

Commit 121fad1 (verified) · 1 parent: dbf6460

Upload folder using huggingface_hub

Files changed (3)

README.md +96 -94
inference.py +6 -1
walkthrough.md +61 -0

README.md CHANGED

@@ -1,94 +1,96 @@
----
-title: AegisOpenEnv
-emoji: 🏦
-colorFrom: indigo
-colorTo: gray
-sdk: docker
-pinned: false
-license: mit
----
-# 🏦 AegisOpenEnv: AI-Powered Financial Compliance Sandbox
-**AegisOpenEnv** is a high-fidelity Reinforcement Learning environment designed for the **Meta OpenEnv** competition. It translates complex banking compliance regulations into a rigorous, text-augmented simulation for training autonomous financial auditors.
----
-## 🏛️ Why AegisOpenEnv?
-Financial institutions screen millions of transactions daily. Traditional rule-based systems often struggle with **"smurfing"** (structuring transactions just under reporting limits) or adapting to new **Sanctions Lists**.
-AegisOpenEnv allows LLM-based agents to:
-- **Audit Raw Transactions**: Process complex histories and account metadata.
-- **Reason with Regulations**: Dynamically fetch and cite clauses like the **EU AI Act** or **BSA**.
-- **Learn from Feedback**: Use modular reward signals to optimize for high precision and low false positives.
----
-## 🛠️ Task Catalog
-Our environment features a 3-tier difficulty system to evaluate various auditor competencies:
-| Phase | Task ID | Name | Difficulty | Competency Evaluated |
-| :--- | :--- | :--- | :--- | :--- |
-| **I** | `easy_audit` | Sanction Check | 🟢 Easy | Blacklist matching and deterministic identification. |
-| **II** | `medium_audit` | Smurfing Detection | 🟡 Medium | Pattern recognition across temporal windows. |
-| **III** | `hard_audit` | Regulatory Alignment | 🔴 Hard | Legal reasoning and precise clause citation. |
----
-## 👁️ Environment Specification
-### 📝 Action Space (`AuditAction`)
-Agents respond with structured JSON containing:
-- `action_type`: `APPROVE`, `FLAG`, or `REQUEST_INFO`.
-- `target_id`: The identifier of the account or transaction under review.
-- `regulation_citation`: A direct citation of the violated regulation (Required for Hard tier).
-### 👁️ Observation Space (`AuditObservation`)
-Agents receive:
-- `transactions`: Real-time transaction flux.
-- `account_metadata`: Profile data (age, tier, risk level).
-- `retrieved_regs`: Dynamic context window containing regulatory guidelines.
-- `reward`: The score from the previous action.
-### 🎯 Reward Structure
-AegisOpenEnv prioritizes **Zero-Tolerance Compliance**:
-- **Successful Audit**: +0.5 to +1.0 (Identification + Citation).
-- **False Positive**: -1.0 (Inefficiency penalty).
-- **Missed Detection (False Negative)**: **-5.0** (Critical regulatory failure).
----
-## 🚀 Quick Start
-### Installation
-```bash
-pip install -r requirements.txt
-```
-### Local Validation
-```bash
-# Start the server
-uvicorn app:app --port 7860
-# Run OpenEnv validate
-openenv validate http://localhost:7860
-```
-### Inference Baseline
-Ensure you have set your `OPENAI_API_KEY` and `API_BASE_URL` (supports OpenRouter):
-```powershell
-$env:API_BASE_URL = "https://openrouter.ai/api/v1"
-$env:MODEL_NAME = "stepfun/step-3.5-flash:free"
-python inference.py
-```
----
-## 🏁 Compliance Status
-This environment is **100% Compliant** with the Meta OpenEnv specification.
-- **Validation URL**: [armaan020-aegisopenenv.hf.space/health](https://armaan020-aegisopenenv.hf.space/health)
-- **Repo Walkthrough**: View `walkthrough.md` for training logs and REINFORCE results.
----
-*Created for the Meta OpenEnv Prize Pool. Part of the Aegis compliance suite.*

+---
+title: AegisOpenEnv
+emoji: 🏦
+colorFrom: indigo
+colorTo: gray
+sdk: docker
+pinned: false
+license: mit
+---
+# 🏦 AegisOpenEnv: AI-Powered Financial Compliance Sandbox
+**AegisOpenEnv** is a high-fidelity Reinforcement Learning environment designed for the **Meta OpenEnv** competition. It translates complex banking compliance regulations into a rigorous, text-augmented simulation for training autonomous financial auditors.
+---
+## 🏛️ Why AegisOpenEnv?
+Financial institutions screen millions of transactions daily. Traditional rule-based systems often struggle with **"smurfing"** (structuring transactions just under reporting limits) or adapting to new **Sanctions Lists**.
+AegisOpenEnv allows LLM-based agents to:
+- **Audit Raw Transactions**: Process complex histories and account metadata.
+- **Reason with Regulations**: Dynamically fetch and cite clauses like the **EU AI Act** or **BSA**.
+- **Learn from Feedback**: Use modular reward signals to optimize for high precision and low false positives.
+---
+## 🛠️ Task Catalog
+Our environment features a 3-tier difficulty system to evaluate various auditor competencies:
+| Phase | Task ID | Name | Difficulty | Competency Evaluated |
+| :--- | :--- | :--- | :--- | :--- |
+| **I** | `easy_audit` | Sanction Check | 🟢 Easy | Blacklist matching and deterministic identification. |
+| **II** | `medium_audit` | Smurfing Detection | 🟡 Medium | Pattern recognition across temporal windows. |
+| **III** | `hard_audit` | Regulatory Alignment | 🔴 Hard | Legal reasoning and precise clause citation. |
+---
+## 👁️ Environment Specification
+### 📝 Action Space (`AuditAction`)
+Agents respond with structured JSON containing:
+- `action_type`: `APPROVE`, `FLAG`, or `REQUEST_INFO`.
+- `target_id`: The identifier of the account or transaction under review.
+- `regulation_citation`: A direct citation of the violated regulation (Required for Hard tier).
+### 👁️ Observation Space (`AuditObservation`)
+Agents receive:
+- `transactions`: Real-time transaction flux.
+- `account_metadata`: Profile data (age, tier, risk level).
+- `retrieved_regs`: Dynamic context window containing regulatory guidelines.
+- `reward`: The score from the previous action.
+### 🎯 Reward Structure
+AegisOpenEnv prioritizes **Zero-Tolerance Compliance**:
+- **Successful Audit**: +0.5 to +1.0 (Identification + Citation).
+- **False Positive**: -1.0 (Inefficiency penalty).
+- **Missed Detection (False Negative)**: **-5.0** (Critical regulatory failure).
+---
+## 🚀 Quick Start
+### Installation
+```bash
+pip install -r requirements.txt
+```
+### Local Validation
+```bash
+# Start the server
+uvicorn app:app --port 7860
+# Run OpenEnv validate
+openenv validate http://localhost:7860
+```
+### Inference Baseline
+Ensure you have set your API credentials in your terminal session:
+```powershell
+$env:OPENAI_API_KEY = "your-api-key-here"
+$env:API_BASE_URL = "https://openrouter.ai/api/v1"
+$env:MODEL_NAME = "stepfun/step-3.5-flash:free"
+$env:ENV_URL = "https://armaan020-aegisopenenv.hf.space"
+python inference.py
+```
+---
+## 🏁 Compliance Status
+This environment is **100% Compliant** with the Meta OpenEnv specification.
+- **Validation URL**: [armaan020-aegisopenenv.hf.space/health](https://armaan020-aegisopenenv.hf.space/health)
+- **Repo Walkthrough**: View `walkthrough.md` for training logs and REINFORCE results.
+---
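
Reviewer note: the README's Action Space and Reward Structure sections can be read as the following minimal Python sketch. It is not the repository's grader (the walkthrough says scoring lives in `grader.grade()`, which this commit does not show). The name `score_action`, the 0.5/1.0 split between identification and citation, the base reward for approving a clean account, and the neutral score for `REQUEST_INFO` are all assumptions made for illustration; only the -1.0 and -5.0 penalties and the +0.5 to +1.0 range come from the README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditAction:
    """Mirrors the three fields the README lists for the action space."""
    action_type: str                      # "APPROVE", "FLAG", or "REQUEST_INFO"
    target_id: str
    regulation_citation: Optional[str] = None


def score_action(action: AuditAction, is_violation: bool,
                 expected_citation: Optional[str] = None) -> float:
    """Hypothetical scorer following the README's reward table."""
    if action.action_type == "FLAG":
        if not is_violation:
            return -1.0                   # false positive (README: -1.0)
        # Assumption: 0.5 for identification, 1.0 when the citation also matches.
        cited = (expected_citation is not None
                 and action.regulation_citation == expected_citation)
        return 1.0 if cited else 0.5
    if action.action_type == "APPROVE":
        # README: a missed detection costs -5.0.
        # Assumption: approving a clean account earns the base 0.5.
        return -5.0 if is_violation else 0.5
    # Assumption: REQUEST_INFO (and anything unrecognized) is scored as neutral.
    return 0.0
```

The asymmetry is the point of the design: under these numbers a single missed detection cancels out several correct decisions, which pushes an agent toward flagging when evidence is present.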

inference.py CHANGED

@@ -26,7 +26,12 @@ client = OpenAI(
     default_headers=extra_headers if "openrouter" in API_BASE_URL.lower() else None
 )
-SYSTEM_PROMPT = """You are a financial compliance auditor AI.
+SYSTEM_PROMPT = """You are a high-performance financial auditor AI.
+Your goal is to maximize precision and minimize friction.
+- FLAG: Use for CLEAR sanctions (BL targets) or smurfing evidence.
+- APPROVE: Use for CLEAN accounts. Do NOT waste time.
+- REQUEST_INFO: ONLY use if the risk is ambiguous. Unnecessary requests are penalized.
 Respond ONLY with a JSON object:
 {"action_type": "FLAG|APPROVE|REQUEST_INFO", "target_id": "<id>", "regulation_citation": "<cite>"}"""
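
Reviewer note: the prompt asks the model to reply with a JSON object only, but the diff does not show how `inference.py` handles replies that deviate from that. Below is a minimal sketch of one way to parse and validate such a reply, assuming the model may wrap the object in extra text or a code fence. The function name `parse_action` and the choice to raise `ValueError` on bad input are hypothetical and not taken from the repository.

```python
import json

VALID_ACTIONS = {"FLAG", "APPROVE", "REQUEST_INFO"}


def parse_action(raw: str) -> dict:
    """Parse a model reply into the three fields named in SYSTEM_PROMPT."""
    # Take the span from the first '{' to the last '}' so that surrounding
    # prose or markdown fences are ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError
    action_type = str(data.get("action_type", "")).upper()
    if action_type not in VALID_ACTIONS:
        raise ValueError(f"unexpected action_type: {action_type!r}")
    return {
        "action_type": action_type,
        "target_id": str(data.get("target_id", "")),
        "regulation_citation": data.get("regulation_citation"),
    }


reply = '```json\n{"action_type": "flag", "target_id": "TX-9", "regulation_citation": "BSA"}\n```'
action = parse_action(reply)
```

Raising on malformed output, rather than silently defaulting to an action, keeps a parsing failure from being scored as a deliberate `APPROVE` or `FLAG`.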

walkthrough.md ADDED

@@ -0,0 +1,61 @@

+# AegisOpenEnv: Final Compliance & Training Walkthrough
+AegisOpenEnv is now 100% compliant with the Meta OpenEnv competition specifications. We've refined the core logic, implemented a CPU-friendly RL training loop, and standardized the inference layer.
+## 🚀 Key Improvements
+### 1. Modular Grader Refinement
+Moved hardcoded scoring logic from `server.py` to a dedicated `Grader` class in [grader.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/grader.py).
+- **Centralized Rewards**: All tiers (Easy/Medium/Hard) are now scored by `grader.grade()`.
+- **Penalties**: False Positive (-1.0) and False Negative (-5.0) penalties are strictly enforced.
+### 2. CPU-Optimized RL Training
+Implemented [train_cpu.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/train_cpu.py) using the **REINFORCE** algorithm.
+- **Goal**: Allow training on systems without high-end GPUs.
+- **Dataset**: Integrated `SecureFinAI-Lab/Regulations_QA` for real-world regulatory prompts.
+- **Logic**: Manual policy gradient backprop ($Loss = -log\_prob \times reward$).
+### 3. Standardized Inference
+Consolidated all inference logic into a single, official [inference.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/inference.py).
+- **Compliance**: Uses `openai.OpenAI()` client exclusively.
+- **Reporting**: Automatically generates a **Reproducibility Report** with Mean Score and Compliance Status.
+- **Cleanup**: Removed legacy `client.py` and `baseline_inference.py`.
+## ✅ Validation Results
+The environment passed the official `openenv validate` suite with **100% SUCCESS**.
+```json
+{
+  "target": "https://armaan020-aegisopenenv.hf.space",
+  "passed": true,
+  "required": true,
+  "expected": {
+    "/reset": true,
+    "/step": true,
+    "/state": true
+  },
+  "actual": {
+    "status_code": 200,
+    "/reset": true,
+    "/step": true,
+    "/state": true
+  }
+}
+```
+## 🛠️ How to Deploy
+To finalize the submission to Hugging Face Spaces:
+1.  Login to HF CLI: `huggingface-cli login`
+2.  Run the deployment script:
+    ```bash
+    python deploy_hf.py armaan020/AegisOpenEnv --public
+    ```
+> [!IMPORTANT]
+> The environment expects the model `Qwen/Qwen2.5-0.5B-Instruct` by default for CPU training. You can override this using the `MODEL_NAME` environment variable.
+---
+*Created for the Meta OpenEnv Prize Pool. Part of the Aegis compliance suite.*
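
Reviewer note: the walkthrough describes the training logic as a manual policy gradient with loss equal to `-log_prob × reward`, but `train_cpu.py` is not part of this diff. The sketch below shows that update on a toy softmax policy over three action logits in plain Python. It is an illustration of the REINFORCE rule only; the function names, the learning rate, and the use of raw logits instead of a language model are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, reward, lr=0.1):
    """One REINFORCE update on raw logits; returns (loss, new_logits)."""
    probs = softmax(logits)
    loss = -math.log(probs[action]) * reward
    # For a softmax policy: d(loss)/d(logit_i) = -(1[i == action] - p_i) * reward
    grads = [-((1.0 if i == action else 0.0) - p) * reward
             for i, p in enumerate(probs)]
    new_logits = [x - lr * g for x, g in zip(logits, grads)]
    return loss, new_logits


# Index 1 stands in for one of the three actions (hypothetical mapping).
loss, rewarded = reinforce_step([0.0, 0.0, 0.0], action=1, reward=1.0)
_, penalized = reinforce_step([0.0, 0.0, 0.0], action=1, reward=-5.0)
```

A positive reward raises the probability of the sampled action and a negative reward lowers it, so the -5.0 penalty for a missed detection moves the policy five times as far as a +1.0 success at the same learning rate.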