title: Paygorn | HFT Auditor
emoji: 🛡️
colorFrom: red
colorTo: blue
sdk: docker
pinned: false
app_port: 7860
                                      ░██████████           ░██████████            ░██████
                                      ░██                       ░██               ░██  ░██
                                      ░██                       ░██              ░██
                                      ░█████████                ░██               ░████████
                                      ░██                       ░██                      ░██
                                      ░██                       ░██               ░██  ░██
                                      ░██████████    ░██        ░██       ░██      ░██████
                                                   
                                                   
                                                   

Elite Trade Sentry: A High-Frequency Trade Reconciliation Engine

A C++20 native RL environment for microsecond-scale financial anomaly detection

OpenEnv C++20 Python 3.12 nanobind PPO Docker


🔍 What is Elite Trade Sentry?

Elite Trade Sentry (ETS) is a production-grade reinforcement learning environment that simulates the compliance enforcement pipeline of a high-frequency trading desk. It challenges an agent to detect and classify financial anomalies (spoofed orders, price mismatches, expired receipts) in microsecond-scale trade streams.

The core engine is written entirely in C++20 and exposed to Python via nanobind with zero-copy memory semantics. The RL loop, training harness, and FastAPI server are all Python, but the computation that matters happens at C++ speed.

This is not a simulation that pretends to be fast. It's a genuine lock-free, cache-friendly data structure running inside your Python process.


🏗️ Architecture

┌──────────────────────────────────────────────────────────────────────┐
│  PYTHON LAYER (OpenEnv / FastAPI / Stable-Baselines3)                │
│                                                                      │
│  ┌─────────────┐    ┌──────────────────┐    ┌────────────────────┐   │
│  │  train.py   │    │  server/app.py   │    │   inference.py     │   │
│  │  PPO Agent  │    │  FastAPI Server  │    │   LLM Baseline     │   │
│  └──────┬──────┘    └────────┬─────────┘    └─────────┬──────────┘   │
│         └────────────────────┼────────────────────────┘              │
│                              ▼                                       │
│            ┌──────────────────────────────────┐                      │
│            │   FinAuditorEnvironment          │                      │
│            │   server/fin_auditor_environment │                      │
│            └─────────────┬────────────────────┘                      │
│                          │ nanobind (zero-copy)                      │
├──────────────────────────┼───────────────────────────────────────────┤
│  C++20 NATIVE ENGINE     │                                           │
│                          ▼                                           │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │               ReconciliationEngine                              │ │
│  │                                                                 │ │
│  │  ┌────────────────┐  SPSC  ┌──────────────┐  ┌──────────────┐   │ │
│  │  │ SPSCRingBuffer │ ────→  │  OrderPool   │  │  TimerWheel  │   │ │
│  │  │ Lock-free      │        │  O(1) insert │  │  O(1) expire │   │ │
│  │  │ 2M slots       │        │  Flat array  │  │  3-level     │   │ │
│  │  └────────────────┘        └──────────────┘  └──────────────┘   │ │
│  │                                                                 │ │
│  │       tick() → drain → expire → get_anomaly_matrix()            │ │
│  │       compute_reward(agent_actions) → float                     │ │
│  └─────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘

⚡ Key Technical Features

C++20 Engine Core

| Component | Implementation | Complexity |
| --- | --- | --- |
| OrderPool | Flat array with power-of-2 indexing | O(1) insert, O(1) lookup |
| TimerWheel | 3-level hierarchical slot structure | O(1) schedule, O(1) expire |
| SPSCRingBuffer | Lock-free single-producer/consumer | O(1) push/pop, no mutex |
| ReconciliationEngine | Orchestrator for all subsystems | O(N) per tick (N = active) |
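
The O(1) push/pop figure for a power-of-two ring buffer comes from wrapping indices with a bit mask instead of a modulo or a search. The sketch below illustrates only that indexing scheme in Python. It is single-threaded and not lock-free, and the `RingBuffer` class is a hypothetical stand-in, not the engine's `SPSCRingBuffer`.

```python
class RingBuffer:
    """Fixed-capacity FIFO queue using power-of-two index masking."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._slots = [None] * capacity
        self._mask = capacity - 1
        self._head = 0  # total items read so far (consumer side)
        self._tail = 0  # total items written so far (producer side)

    def push(self, item) -> bool:
        """Append an item; return False when the buffer is full."""
        if self._tail - self._head == len(self._slots):
            return False
        self._slots[self._tail & self._mask] = item
        self._tail += 1
        return True

    def pop(self):
        """Remove and return the oldest item, or None when empty."""
        if self._head == self._tail:
            return None
        item = self._slots[self._head & self._mask]
        self._head += 1
        return item


buf = RingBuffer(4)
accepted = [buf.push(i) for i in range(5)]  # the fifth push finds the buffer full
drained = [buf.pop() for _ in range(3)]     # frees three slots
buf.push(99)                                # wraps around into a freed slot
```

In a real SPSC queue the head and tail counters would be atomics owned by the consumer and producer threads respectively; that part is outside what this sketch shows.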

Zero-Copy Python Bridge

The observation matrix, an (N, 4) float32 array of trade feature vectors, is returned directly from C++ memory to Python as a numpy.ndarray. There is no serialization, copy, or allocation on this path. nanobind's reference_internal return policy ties the engine's lifetime to the array, so the engine cannot be garbage-collected while the array still exists.
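
As a rough analogy for what "zero-copy" means on the Python side, the following pure-NumPy sketch returns a view over storage owned by another object. `FakeEngine` is a hypothetical stand-in for the native engine; only the method name `get_anomaly_matrix` is taken from the architecture diagram, and nothing here goes through nanobind.

```python
import numpy as np


class FakeEngine:
    """Stand-in for the native engine; owns one flat float32 buffer."""

    def __init__(self, n_trades: int) -> None:
        self._storage = np.zeros(n_trades * 4, dtype=np.float32)

    def get_anomaly_matrix(self) -> np.ndarray:
        # Reshaping a contiguous array returns a view, not a copy.
        return self._storage.reshape(-1, 4)


engine = FakeEngine(40)
obs = engine.get_anomaly_matrix()
obs[0, 3] = 0.9  # written through the view into the engine's buffer
shares = np.shares_memory(obs, engine._storage)
```

Because `obs.base` references the owning buffer, the storage stays alive for as long as the view does, which is the same guarantee the README attributes to reference_internal.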

The MDP (Markov Decision Process)

State (Observation)     Action              Reward
───────────────────     ──────              ──────
Matrix (N, 4):          Per-trade:          Asymmetric:
  time_elapsed          0 = PASS            +1.0  True Positive  (TP)
  price_delta           1 = FLAG            +0.5  True Negative  (TN)
  missing_frequency                         -0.1  False Positive (FP)
  risk_score                                 0.0  False Negative (FN)

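Why asymmetric? Missing a real anomaly (FN) in HFT has catastrophic systemic consequences, while falsely flagging a valid trade (FP) is a minor operational cost. The rewards encode this through the gap between the right and wrong action on each trade: passing an anomaly forfeits 1.0 (0.0 instead of +1.0), whereas flagging a safe trade forfeits 0.6 (-0.1 instead of +0.5).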


🎯 Task Difficulty Modes

The environment supports three difficulty levels, controlled via the TASK_ID environment variable or the /config/difficulty API endpoint:

| Mode | Anomaly Rate | Signal | Counterparty Clusters |
| --- | --- | --- | --- |
| EASY | 100% anomalies | Deterministic; risk_score > 0.5 always anomalous | High-risk: [70–99] |
| MEDIUM | 50% anomalies | Probabilistic; 80% high-risk = anomaly, 10% FP chance | High-risk: [70–99], Low-risk: [0–19] |
| HARD | 20% anomalies | Adversarial; weak correlations, high noise | Mixed; no clean separation |
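
To make the anomaly rates concrete, here is a minimal sketch that draws ground-truth labels at each mode's rate. The rates come from the table above; everything else (`DifficultyProfile`, `PROFILES`, `sample_labels`, and the choice of independent per-trade sampling) is an assumption for illustration and may not match how the engine generates batches.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DifficultyProfile:
    name: str
    anomaly_rate: float  # fraction of trades that are true anomalies


# Rates copied from the difficulty table; the mapping itself is illustrative.
PROFILES = {
    "EASY": DifficultyProfile("EASY", 1.0),
    "MEDIUM": DifficultyProfile("MEDIUM", 0.5),
    "HARD": DifficultyProfile("HARD", 0.2),
}


def sample_labels(mode: str, n_trades: int, seed: int = 0) -> list[int]:
    """Draw ground-truth labels (1 = anomaly) at the mode's anomaly rate."""
    rate = PROFILES[mode].anomaly_rate
    rng = random.Random(seed)
    return [1 if rng.random() < rate else 0 for _ in range(n_trades)]


labels = sample_labels("HARD", 10_000)
observed_rate = sum(labels) / len(labels)  # close to 0.2 for a large sample
```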

🚀 Quick Start

Prerequisites

  • Python 3.12 (via uv)
  • C++ compiler: MSVC 2022 (Windows) or GCC/Clang (Linux)
  • CMake ≥ 3.15

1. Clone & Install

git clone https://github.com/SamaKool/Paygorn.git
cd Paygorn
pip install uv
uv sync

2. Build the C++ Engine

# Windows (Visual Studio 2022), Linux, or macOS: the same command on every platform
python build_engine.py

# Memory-safe build for constrained environments (Docker/HF Spaces)
python build_engine.py --docker-safe

On success you will see:

✓  Build complete.  Extension available at: D:\Paygorn\hft_auditor.cp312-win_amd64.pyd

3. Run the Server

cd server
uv run python app.py
# → http://localhost:8000

4. Train the PPO Agent

uv run python train.py
# Checkpoints saved to ./logs/ every 5,000 steps

5. Run LLM Inference Baseline

export HF_TOKEN=hf_...
uv run python inference.py

🖥️ Command Center Dashboard

The FastAPI server serves an interactive HFT Command Center at http://localhost:8000/. It provides:

  • Real-time telemetry: engine latency (μs), audit accuracy, throughput (M/s), buffer saturation
  • Authority Checklist: live status of the C++ binary, API key, model, and LLM connection
  • LLM Router Config: inject HuggingFace / OpenAI / Anthropic keys and auto-discover models
  • Manual Override: trigger reset, step_optimal, and step_random actions from the browser
  • Execution Ledger: live log of each step's reward, TP, FP, and done state
  • Classification Pulse: animated bar chart of TP/TN/FP/FN from the last step
  • HARD Mode CRT Effect: the UI applies an adversarial visual overlay when difficulty = HARD

📡 API Reference

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Interactive HFT Command Center UI |
| /state | GET | Single source of truth: all telemetry + health |
| /api/reset | POST | Flush SPSC buffer and reset episode |
| /api/step | POST | Execute one agent step (random, perfect, llm, ppo) |
| /config/llm | POST | Validate API key and discover provider models |
| /config/default | POST | Zero-config mode using HF_TOKEN env var |
| /config/difficulty | POST | Switch difficulty (EASY, MEDIUM, HARD_ADVERSARIAL) |
| /docs | GET | Full OpenAPI / Swagger documentation |

Example: Run a step with a random agent

curl -X POST http://localhost:8000/api/step \
  -H "Content-Type: application/json" \
  -d '{"action_type": "random"}'

Example response:

{
  "status": "success",
  "step": 1,
  "reward": 0.3125,
  "done": false,
  "tp": 8,
  "tn": 12,
  "fp": 3,
  "fn": 17
}
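
The same call can be prepared from Python with only the standard library. This is a sketch under assumptions: the helper names `build_step_request` and `accuracy` are hypothetical, and the response fields are taken from the example response above rather than from a schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # local server address from the Quick Start


def build_step_request(action_type: str) -> Request:
    """Build (but do not send) the POST /api/step request."""
    body = json.dumps({"action_type": action_type}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def accuracy(step_result: dict) -> float:
    """Fraction of trades classified correctly in one step response."""
    correct = step_result["tp"] + step_result["tn"]
    total = correct + step_result["fp"] + step_result["fn"]
    return correct / total if total else 0.0


req = build_step_request("random")
sample = {"tp": 8, "tn": 12, "fp": 3, "fn": 17}  # counts from the example response
sample_accuracy = accuracy(sample)  # 20 correct out of 40 trades
```

With the server running, passing `req` to `urllib.request.urlopen` and decoding the body with `json.loads` should yield a dictionary that `accuracy` can consume.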

🧠 Python Integration

Direct Environment Usage

from server.fin_auditor_environment import FinAuditorEnvironment
from models import AuditorAction
import numpy as np

env = FinAuditorEnvironment()

# Reset: returns an empty feature matrix to flush state
obs = env.reset()

# Step: generate a batch, expire trades, compute reward
action = AuditorAction(decisions=[1] * 40)   # flag all as anomalous
obs = env.step(action)

print(f"Reward:   {obs.reward:.4f}")
print(f"Features: {np.array(obs.features).shape}")  # (N, 4)
print(f"TP/FP:    {env.state.last_tp} / {env.state.last_fp}")

Gymnasium Wrapper (for PPO / SB3)

from train import GymnasiumFinAuditorEnv
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

env = DummyVecEnv([lambda: GymnasiumFinAuditorEnv()])
env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10.0)

model = PPO("MlpPolicy", env, verbose=1, n_steps=256, batch_size=64)
model.learn(total_timesteps=100_000)
model.save("ppo_fin_auditor_final")

LLM Zero-Shot Baseline (via HuggingFace Router)

import asyncio
import os

from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.environ["HF_TOKEN"],  # the token exported in the Quick Start
    base_url="https://router.huggingface.co/v1",
)


async def main() -> None:
    # The engine gives the LLM a JSON decision schema.
    # The LLM outputs {"decisions": [0, 1, 1, 0, ...]} for 40 trades.
    response = await client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        messages=[{"role": "user", "content": "Flag anomalies in this trading batch."}],
        response_format={"type": "json_object"},
        temperature=0.2,
    )
    print(response.choices[0].message.content)


# `await` is only valid inside a coroutine, so run the call through asyncio.
asyncio.run(main())
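
An LLM reply is not guaranteed to follow the decision schema, so it is worth validating before it reaches the environment. The sketch below is one possible approach: `parse_decisions` is a hypothetical helper, and falling back to an all-PASS batch on a malformed reply is an assumption, not the behaviour of inference.py.

```python
import json

BATCH_SIZE = 40  # trades per batch, as in the examples above


def parse_decisions(raw: str, batch_size: int = BATCH_SIZE) -> list[int]:
    """Return validated 0/1 decisions, or an all-PASS batch if the reply is unusable."""
    fallback = [0] * batch_size
    try:
        decisions = json.loads(raw)["decisions"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return fallback
    if not isinstance(decisions, list) or len(decisions) != batch_size:
        return fallback
    # Accept only plain integers 0 or 1 (rejects bools, floats, strings).
    if any(type(d) is not int or d not in (0, 1) for d in decisions):
        return fallback
    return decisions


good = parse_decisions(json.dumps({"decisions": [0, 1] * 20}))
bad = parse_decisions("not json at all")
```

Given the reward structure, a fallback that flags everything may be the safer default; which one to use depends on how much a false positive costs in practice.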

🏭 Project Structure

Paygorn/
│
├── hf auditor/                         # C++20 Engine Source
│   ├── CMakeLists.txt                  # Build configuration
│   ├── src/
│   │   ├── auditor.cpp                 # nanobind module entry point
│   │   ├── reconciliation_engine.hpp   # Main orchestrator (C++20)
│   │   ├── order_pool.hpp              # O(1) flat-array trade storage
│   │   ├── spsc_ring_buffer.hpp        # Lock-free SPSC queue
│   │   └── timer_wheel.hpp             # O(1) hierarchical expiration
│   └── benchmarks/
│       └── benchmark_engine.cpp        # Google Benchmark harness
│
├── server/                             # Python API Layer
│   ├── app.py                          # FastAPI server + Command Center UI
│   ├── fin_auditor_environment.py      # OpenEnv environment wrapper
│   └── __init__.py
│
├── build_engine.py                     # Cross-platform CMake build script
├── train.py                            # PPO training loop (Stable-Baselines3)
├── inference.py                        # LLM zero-shot inference baseline
├── final_check.py                      # End-to-end validation harness
├── models.py                           # Pydantic Action / Observation types
├── openenv.yaml                        # OpenEnv task manifest
├── pyproject.toml                      # Python project metadata
├── Dockerfile                          # Multi-stage Docker build
└── README.md                           # This file

🐳 Docker & Hugging Face Spaces

The Dockerfile uses a 2-stage build strategy for memory-constrained environments:

  1. Stage 1 (builder): Installs Python deps, then checks for a pre-compiled .so. If found, it uses it directly. If not, it compiles with -j1 -O1 (--docker-safe) to cap RAM usage at ~1.2 GB vs ~5 GB for a full -O3 build.

  2. Stage 2 (runtime): A minimal image containing only the venv, app code, and the compiled .so.

# Local build
docker build -t paygorn:latest .
docker run -p 8000:7860 paygorn:latest

# Deploy to Hugging Face Spaces
openenv push --repo-id your-username/paygorn

Recommended Workflow for HF Spaces

Pre-compile the .so locally before pushing:

python build_engine.py          # Produces hft_auditor.cpXXX-linux_x86_64.so
git add hft_auditor*.so
git commit -m "Add pre-compiled engine binary"
openenv push

The Docker builder will detect the pre-compiled binary and skip the expensive CMake step entirely.


🔧 Build Options

# Standard Windows build (Visual Studio 17 2022)
python build_engine.py

# Standard Linux build (Ninja / Make)
python build_engine.py

# Memory-safe build (-O1, single threaded): for HF Spaces / CI
python build_engine.py --docker-safe

# Rebuild from scratch (cleans build/ directory first)
python build_engine.py   # always cleans first

📊 Reward Function Design

The reward function encodes the asymmetric cost structure of real HFT compliance:

R(action, truth) = {
    +1.0   if action=FLAG   AND truth=ANOMALY   → True  Positive (TP)
    +0.5   if action=PASS   AND truth=SAFE      → True  Negative (TN)
    -0.1   if action=FLAG   AND truth=SAFE      → False Positive (FP)
     0.0   if action=PASS   AND truth=ANOMALY   → False Negative (FN)
}

Why zero for FN? A zero reward means the agent gains nothing from missing a real anomaly, which acts as an implicit penalty relative to the +1.0 it would have earned by flagging it. PPO scores each action by its advantage over a learned value baseline, so when the baseline expects a positive return, a missed anomaly yields a negative advantage. This creates pressure to avoid FNs without requiring a separate penalty term.
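
The reward table translates directly into a lookup keyed by (action, truth). The sketch below is a Python rendering for illustration only: the real `compute_reward` lives in the C++ engine, and whether it sums or averages over the batch is an assumption here (this version sums). The `regret` helper is hypothetical and shows how much reward each kind of mistake gives up.

```python
PASS, FLAG = 0, 1

# Per-trade rewards from the reward table, keyed by (action, truth).
REWARDS = {
    (FLAG, 1): 1.0,   # true positive
    (PASS, 0): 0.5,   # true negative
    (FLAG, 0): -0.1,  # false positive
    (PASS, 1): 0.0,   # false negative
}


def batch_reward(actions: list[int], truth: list[int]) -> float:
    """Sum the per-trade rewards for one batch of decisions."""
    if len(actions) != len(truth):
        raise ValueError("actions and truth must have the same length")
    return sum(REWARDS[(a, t)] for a, t in zip(actions, truth))


def regret(truth_label: int) -> float:
    """Reward given up by choosing the wrong action for a trade with this label."""
    right, wrong = (FLAG, PASS) if truth_label == 1 else (PASS, FLAG)
    return REWARDS[(right, truth_label)] - REWARDS[(wrong, truth_label)]


total = batch_reward([1, 0, 1, 0], [1, 0, 0, 1])  # one each of TP, TN, FP, FN
```

Comparing `regret(1)` with `regret(0)` shows the asymmetry: a missed anomaly gives up more reward than a false flag, even though the FN entry itself is zero.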


🀝 OpenEnv Compliance

ETS is fully compliant with the OpenEnv standard. The environment exposes:

  • reset() → AuditorObservation: initializes and returns a clean state
  • step(AuditorAction) → AuditorObservation: processes one decision batch
  • SUPPORTS_CONCURRENT_SESSIONS = True: multiple WebSocket clients can connect simultaneously
  • Three registered tasks in openenv.yaml: anomaly_detection_easy, anomaly_detection_medium, anomaly_detection_hard

<div align="center">

Built with ⚡ C++20, 🐍 Python 3.12, and 🧠 RL by the PayGorn team.

</div>