Commit 121fad1 · armaan020 · verified · 1 Parent(s): dbf6460

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +96 -94
  2. inference.py +6 -1
  3. walkthrough.md +61 -0
README.md CHANGED
@@ -1,94 +1,96 @@
1
- ---
2
- title: AegisOpenEnv
3
- emoji: 🏦
4
- colorFrom: indigo
5
- colorTo: gray
6
- sdk: docker
7
- pinned: false
8
- license: mit
9
- ---
10
-
11
- # 🏦 AegisOpenEnv: AI-Powered Financial Compliance Sandbox
12
-
13
- **AegisOpenEnv** is a high-fidelity Reinforcement Learning environment designed for the **Meta OpenEnv** competition. It translates complex banking compliance regulations into a rigorous, text-augmented simulation for training autonomous financial auditors.
14
-
15
- ---
16
-
17
- ## 🏛️ Why AegisOpenEnv?
18
- Financial institutions screen millions of transactions daily. Traditional rule-based systems often struggle with **"smurfing"** (structuring transactions just under reporting limits) or adapting to new **Sanctions Lists**.
19
-
20
- AegisOpenEnv allows LLM-based agents to:
21
- - **Audit Raw Transactions**: Process complex histories and account metadata.
22
- - **Reason with Regulations**: Dynamically fetch and cite clauses like the **EU AI Act** or **BSA**.
23
- - **Learn from Feedback**: Use modular reward signals to optimize for high precision and low false positives.
24
-
25
- ---
26
-
27
- ## 🛠️ Task Catalog
28
-
29
- Our environment features a 3-tier difficulty system to evaluate various auditor competencies:
30
-
31
- | Phase | Task ID | Name | Difficulty | Competency Evaluated |
32
- | :--- | :--- | :--- | :--- | :--- |
33
- | **I** | `easy_audit` | Sanction Check | 🟢 Easy | Blacklist matching and deterministic identification. |
34
- | **II** | `medium_audit` | Smurfing Detection | 🟡 Medium | Pattern recognition across temporal windows. |
35
- | **III** | `hard_audit` | Regulatory Alignment | 🔴 Hard | Legal reasoning and precise clause citation. |
36
-
37
- ---
38
-
39
- ## 👁️ Environment Specification
40
-
41
- ### 📝 Action Space (`AuditAction`)
42
- Agents respond with structured JSON containing:
43
- - `action_type`: `APPROVE`, `FLAG`, or `REQUEST_INFO`.
44
- - `target_id`: The identifier of the account or transaction under review.
45
- - `regulation_citation`: A direct citation of the violated regulation (Required for Hard tier).
46
-
47
- ### 👁️ Observation Space (`AuditObservation`)
48
- Agents receive:
49
- - `transactions`: Real-time transaction flux.
50
- - `account_metadata`: Profile data (age, tier, risk level).
51
- - `retrieved_regs`: Dynamic context window containing regulatory guidelines.
52
- - `reward`: The score from the previous action.
53
-
54
- ### 🎯 Reward Structure
55
- AegisOpenEnv prioritizes **Zero-Tolerance Compliance**:
56
- - **Successful Audit**: +0.5 to +1.0 (Identification + Citation).
57
- - **False Positive**: -1.0 (Inefficiency penalty).
58
- - **Missed Detection (False Negative)**: **-5.0** (Critical regulatory failure).
59
-
60
- ---
61
-
62
- ## 🚀 Quick Start
63
-
64
- ### Installation
65
- ```bash
66
- pip install -r requirements.txt
67
- ```
68
-
69
- ### Local Validation
70
- ```bash
71
- # Start the server
72
- uvicorn app:app --port 7860
73
-
74
- # Run OpenEnv validate
75
- openenv validate http://localhost:7860
76
- ```
77
-
78
- ### Inference Baseline
79
- Ensure you have set your `OPENAI_API_KEY` and `API_BASE_URL` (supports OpenRouter):
80
- ```powershell
81
- $env:API_BASE_URL = "https://openrouter.ai/api/v1"
82
- $env:MODEL_NAME = "stepfun/step-3.5-flash:free"
83
- python inference.py
84
- ```
85
-
86
- ---
87
-
88
- ## 🏁 Compliance Status
89
- This environment is **100% Compliant** with the Meta OpenEnv specification.
90
- - **Validation URL**: [armaan020-aegisopenenv.hf.space/health](https://armaan020-aegisopenenv.hf.space/health)
91
- - **Repo Walkthrough**: View `walkthrough.md` for training logs and REINFORCE results.
92
-
93
- ---
94
- *Created for the Meta OpenEnv Prize Pool. Part of the Aegis compliance suite.*
1
+ ---
2
+ title: AegisOpenEnv
3
+ emoji: 🏦
4
+ colorFrom: indigo
5
+ colorTo: gray
6
+ sdk: docker
7
+ pinned: false
8
+ license: mit
9
+ ---
10
+
11
+ # 🏦 AegisOpenEnv: AI-Powered Financial Compliance Sandbox
12
+
13
+ **AegisOpenEnv** is a high-fidelity Reinforcement Learning environment designed for the **Meta OpenEnv** competition. It translates complex banking compliance regulations into a rigorous, text-augmented simulation for training autonomous financial auditors.
14
+
15
+ ---
16
+
17
+ ## 🏛️ Why AegisOpenEnv?
18
+ Financial institutions screen millions of transactions daily. Traditional rule-based systems often struggle with **"smurfing"** (structuring transactions just under reporting limits) or adapting to new **Sanctions Lists**.
19
+
20
+ AegisOpenEnv allows LLM-based agents to:
21
+ - **Audit Raw Transactions**: Process complex histories and account metadata.
22
+ - **Reason with Regulations**: Dynamically fetch and cite clauses like the **EU AI Act** or **BSA**.
23
+ - **Learn from Feedback**: Use modular reward signals to optimize for high precision and low false positives.
24
+
25
+ ---
26
+
27
+ ## 🛠️ Task Catalog
28
+
29
+ Our environment features a 3-tier difficulty system to evaluate various auditor competencies:
30
+
31
+ | Phase | Task ID | Name | Difficulty | Competency Evaluated |
32
+ | :--- | :--- | :--- | :--- | :--- |
33
+ | **I** | `easy_audit` | Sanction Check | 🟢 Easy | Blacklist matching and deterministic identification. |
34
+ | **II** | `medium_audit` | Smurfing Detection | 🟡 Medium | Pattern recognition across temporal windows. |
35
+ | **III** | `hard_audit` | Regulatory Alignment | 🔴 Hard | Legal reasoning and precise clause citation. |
36
+
37
+ ---
38
+
39
+ ## 👁️ Environment Specification
40
+
41
+ ### 📝 Action Space (`AuditAction`)
42
+ Agents respond with structured JSON containing:
43
+ - `action_type`: `APPROVE`, `FLAG`, or `REQUEST_INFO`.
44
+ - `target_id`: The identifier of the account or transaction under review.
45
+ - `regulation_citation`: A direct citation of the violated regulation (Required for Hard tier).
46
+
47
+ ### 👁️ Observation Space (`AuditObservation`)
48
+ Agents receive:
49
+ - `transactions`: Real-time transaction flux.
50
+ - `account_metadata`: Profile data (age, tier, risk level).
51
+ - `retrieved_regs`: Dynamic context window containing regulatory guidelines.
52
+ - `reward`: The score from the previous action.
53
+
54
+ ### 🎯 Reward Structure
55
+ AegisOpenEnv prioritizes **Zero-Tolerance Compliance**:
56
+ - **Successful Audit**: +0.5 to +1.0 (Identification + Citation).
57
+ - **False Positive**: -1.0 (Inefficiency penalty).
58
+ - **Missed Detection (False Negative)**: **-5.0** (Critical regulatory failure).
59
+
60
+ ---
61
+
62
+ ## 🚀 Quick Start
63
+
64
+ ### Installation
65
+ ```bash
66
+ pip install -r requirements.txt
67
+ ```
68
+
69
+ ### Local Validation
70
+ ```bash
71
+ # Start the server
72
+ uvicorn app:app --port 7860
73
+
74
+ # Run OpenEnv validate
75
+ openenv validate http://localhost:7860
76
+ ```
77
+
78
+ ### Inference Baseline
79
+ Ensure you have set your API credentials in your terminal session:
80
+ ```powershell
81
+ $env:OPENAI_API_KEY = "your-api-key-here"
82
+ $env:API_BASE_URL = "https://openrouter.ai/api/v1"
83
+ $env:MODEL_NAME = "stepfun/step-3.5-flash:free"
84
+ $env:ENV_URL = "https://armaan020-aegisopenenv.hf.space"
85
+ python inference.py
86
+ ```
87
+
88
+ ---
89
+
90
+ ## 🏁 Compliance Status
91
+ This environment is **100% Compliant** with the Meta OpenEnv specification.
92
+ - **Validation URL**: [armaan020-aegisopenenv.hf.space/health](https://armaan020-aegisopenenv.hf.space/health)
93
+ - **Repo Walkthrough**: View `walkthrough.md` for training logs and REINFORCE results.
94
+
95
+ ---
96
+
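Reviewer note on the Reward Structure section of README.md: the README lists the score bands but this commit does not show how they combine. Below is a minimal Python sketch of one possible reading. The name `grade_sketch`, its parameters, the split of +0.5 for identification plus +0.5 for a correct citation, the 0.0 score for a correct `APPROVE`, and the neutral score for `REQUEST_INFO` are all assumptions for illustration. It is not the project's `Grader`.

```python
from typing import Optional

FALSE_POSITIVE_PENALTY = -1.0  # README: flagging a clean target
FALSE_NEGATIVE_PENALTY = -5.0  # README: approving a violating target


def grade_sketch(action_type: str, is_violation: bool,
                 citation_correct: Optional[bool] = None) -> float:
    """Hypothetical scoring rule that mirrors the README's reward bands."""
    if action_type == "FLAG":
        if not is_violation:
            return FALSE_POSITIVE_PENALTY
        # Assumed split: 0.5 for identification, 0.5 more for a correct citation.
        return 1.0 if citation_correct else 0.5
    if action_type == "APPROVE":
        # Assumption: a correct approval is neutral (0.0).
        return FALSE_NEGATIVE_PENALTY if is_violation else 0.0
    if action_type == "REQUEST_INFO":
        # Assumption: neutral here; the README gives no value for this action.
        return 0.0
    raise ValueError(f"unknown action_type: {action_type!r}")
```

Under these assumed values, with p the probability that the target is a violation, an uncited `FLAG` has expected score 0.5p - (1 - p) and an `APPROVE` has -5p, so flagging scores higher once p exceeds 1/6.5, roughly 0.15. That asymmetry is what the README's "Zero-Tolerance Compliance" framing implies.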
inference.py CHANGED
@@ -26,7 +26,12 @@ client = OpenAI(
26
  default_headers=extra_headers if "openrouter" in API_BASE_URL.lower() else None
27
  )
28
 
29
- SYSTEM_PROMPT = """You are a financial compliance auditor AI.
30
  Respond ONLY with a JSON object:
31
  {"action_type": "FLAG|APPROVE|REQUEST_INFO", "target_id": "<id>", "regulation_citation": "<cite>"}"""
32
 
 
26
  default_headers=extra_headers if "openrouter" in API_BASE_URL.lower() else None
27
  )
28
 
29
+ SYSTEM_PROMPT = """You are a high-performance financial auditor AI.
30
+ Your goal is to maximize precision and minimize friction.
31
+ - FLAG: Use for CLEAR sanctions (BL targets) or smurfing evidence.
32
+ - APPROVE: Use for CLEAN accounts. Do NOT waste time.
33
+ - REQUEST_INFO: ONLY use if the risk is ambiguous. Unnecessary requests are penalized.
34
+
35
  Respond ONLY with a JSON object:
36
  {"action_type": "FLAG|APPROVE|REQUEST_INFO", "target_id": "<id>", "regulation_citation": "<cite>"}"""
37
 
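Reviewer note on the inference.py hunk: the prompt asks the model to reply with only a JSON object, but the code that parses the reply is outside this hunk. A minimal sketch of defensive parsing, assuming the three field names from the prompt, is shown below. `parse_action`, `ALLOWED_ACTIONS`, and the choice to fall back to `REQUEST_INFO` on malformed output are hypothetical and are not taken from the repository.

```python
import json

ALLOWED_ACTIONS = {"FLAG", "APPROVE", "REQUEST_INFO"}


def parse_action(reply: str) -> dict:
    """Parse a model reply into an action dict; fall back on malformed output."""
    # Assumed fallback; a caller could equally raise or retry instead.
    fallback = {"action_type": "REQUEST_INFO", "target_id": "",
                "regulation_citation": ""}
    # Tolerate text around the JSON by slicing from the first '{' to the last '}'.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return fallback
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    action = data.get("action_type")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return fallback
    return {
        "action_type": action,
        "target_id": str(data.get("target_id", "")),
        "regulation_citation": str(data.get("regulation_citation", "")),
    }
```

Since the new prompt states that unnecessary `REQUEST_INFO` actions are penalized, the fallback choice affects the score and is worth deciding deliberately.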
walkthrough.md ADDED
@@ -0,0 +1,61 @@
1
+ # AegisOpenEnv: Final Compliance & Training Walkthrough
2
+
3
+ AegisOpenEnv is now 100% compliant with the Meta OpenEnv competition specifications. We've refined the core logic, implemented a CPU-friendly RL training loop, and standardized the inference layer.
4
+
5
+ ## 🚀 Key Improvements
6
+
7
+ ### 1. Modular Grader Refinement
8
+ Moved hardcoded scoring logic from `server.py` to a dedicated `Grader` class in [grader.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/grader.py).
9
+ - **Centralized Rewards**: All tiers (Easy/Medium/Hard) are now scored by `grader.grade()`.
10
+ - **Penalties**: False Positive (-1.0) and False Negative (-5.0) penalties are strictly enforced.
11
+
12
+ ### 2. CPU-Optimized RL Training
13
+ Implemented [train_cpu.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/train_cpu.py) using the **REINFORCE** algorithm.
14
+ - **Goal**: Allow training on systems without high-end GPUs.
15
+ - **Dataset**: Integrated `SecureFinAI-Lab/Regulations_QA` for real-world regulatory prompts.
16
+ - **Logic**: Manual policy gradient backprop ($Loss = -log\_prob \times reward$).
17
+
18
+ ### 3. Standardized Inference
19
+ Consolidated all inference logic into a single, official [inference.py](file:///C:/Users/DELL/Desktop/metaRL/aegis_gym/inference.py).
20
+ - **Compliance**: Uses `openai.OpenAI()` client exclusively.
21
+ - **Reporting**: Automatically generates a **Reproducibility Report** with Mean Score and Compliance Status.
22
+ - **Cleanup**: Removed legacy `client.py` and `baseline_inference.py`.
23
+
24
+ ## ✅ Validation Results
25
+
26
+ The environment passed the official `openenv validate` suite with **100% SUCCESS**.
27
+
28
+ ```json
29
+ {
30
+ "target": "https://armaan020-aegisopenenv.hf.space",
31
+ "passed": true,
32
+ "required": true,
33
+ "expected": {
34
+ "/reset": true,
35
+ "/step": true,
36
+ "/state": true
37
+ },
38
+ "actual": {
39
+ "status_code": 200,
40
+ "/reset": true,
41
+ "/step": true,
42
+ "/state": true
43
+ }
44
+ }
45
+ ```
46
+
47
+ ## 🛠️ How to Deploy
48
+
49
+ To finalize the submission to Hugging Face Spaces:
50
+
51
+ 1. Login to HF CLI: `huggingface-cli login`
52
+ 2. Run the deployment script:
53
+ ```bash
54
+ python deploy_hf.py armaan020/AegisOpenEnv --public
55
+ ```
56
+
57
+ > [!IMPORTANT]
58
+ > The environment expects the model `Qwen/Qwen2.5-0.5B-Instruct` by default for CPU training. You can override this using the `MODEL_NAME` environment variable.
59
+
60
+ ---
61
+ *Created for the Meta OpenEnv Prize Pool. Part of the Aegis compliance suite.*
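Reviewer note on the REINFORCE item in walkthrough.md: `train_cpu.py` is not part of this commit, so the loss it describes (Loss = -log_prob × reward) cannot be checked here. The sketch below shows the same update on a toy problem, a softmax policy over a few fixed-reward actions, using only the standard library. The function names, the fixed reward list, the learning rate, and the step count are assumptions for illustration and say nothing about how the actual training script is written.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=500, lr=0.1, seed=0):
    """Train softmax logits on a toy bandit with the REINFORCE update.

    rewards[i] is the fixed reward for picking action i, a toy stand-in
    for an environment score. Returns the final action probabilities.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Loss = -log pi(action) * reward, so
        # d(Loss)/d(logit_i) = -reward * (1[i == action] - probs[i]).
        # Gradient descent on the loss therefore adds lr * reward * (...).
        for i in range(len(logits)):
            indicator = 1.0 if i == action else 0.0
            logits[i] += lr * reward * (indicator - probs[i])
    return softmax(logits)
```

Calling `reinforce_bandit([1.0, -1.0, -5.0])` shifts probability toward the first action, the only one with a positive reward. With a language-model policy, the log-probability would be that of the generated tokens rather than of a single discrete action.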