spec_version: 1
name: fin_auditor
type: space
runtime: fastapi
app: server.app:app
port: 7860
entrypoint: server.fin_auditor_environment:FinAuditorEnvironment
tasks:
  - id: anomaly_detection_easy
    name: "Anomaly Detection (EASY)"
    description: >
      EASY: Detect expired/unreconciled trades. After a reset, trades
      are ingested and immediately aged past the 5-second max window.
      The agent must call /step and receive the anomaly matrix. A valid
      response requires at least one trade flagged (decision=1) in the
      action sent to /step.
    grader: "graders.grader_detection:EasyDetectionGrader"
    difficulty: "easy"
    max_steps: 5
    reward_threshold: 0.5
    score_range: [0.0, 1.0]
  - id: anomaly_detection_medium
    name: "Anomaly Detection (MEDIUM)"
    description: >
      MEDIUM: Identify high-risk counterparties. The agent must process
      the anomaly matrix and flag all trades where risk_score > 0.5.
      Reward is based on precision of high-risk flags. Requires the LLM to
      parse the feature array and make discriminating decisions.
    grader: "graders.grader_classification:MediumClassificationGrader"
    difficulty: "medium"
    max_steps: 10
    reward_threshold: 0.6
    score_range: [0.0, 1.0]
  - id: anomaly_detection_hard
    name: "Anomaly Detection (HARD)"
    description: >
      HARD: Systemic anomaly triage. The agent must detect when
      missing_frequency > 0.3, indicating a systemic anomaly wave,
      and correctly flag ALL trades in the batch. Deducts points for
      false-positive flags during safe periods.
    grader: "graders.grader_fix:HardFixGrader"
    difficulty: "hard"
    max_steps: 20
    reward_threshold: 0.7
    score_range: [0.0, 1.0]