Spaces:
Sleeping
Sleeping
| title: GraphStrike | |
| emoji: π΅οΈ | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| license: mit | |
| tags: | |
| - reinforcement-learning | |
| - social-network | |
| - fraud-detection | |
| - openenv | |
| - llm-agent | |
| base_path: /web | |
| <br> | |
| <p align="center"> | |
| <img src="assets/logo.png" width="600"/> | |
| </p> | |
| <br> | |
| <p align="center"> | |
| <img src="https://img.shields.io/badge/Hugging%20Face-FFD21E?style=for-the-badge&logo=huggingface&logoColor=black"/> | |
| <img src="https://img.shields.io/badge/HF%20Spaces-FFBF00?style=for-the-badge&logo=huggingface&logoColor=black"/> | |
| <img src="https://img.shields.io/badge/FastAPI-009688?style=for-the-badge&logo=fastapi&logoColor=white"/> | |
| <img src="https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white"/> | |
| <img src="https://img.shields.io/badge/Gradio-F97316?style=for-the-badge&logo=gradio&logoColor=white"/> | |
| <img src="https://img.shields.io/badge/OpenEnv-4B5563?style=for-the-badge&logo=envato&logoColor=white"/> | |
| <img src="https://img.shields.io/badge/Amazon%20Bedrock-FF9900?style=for-the-badge&logo=amazonaws&logoColor=white"/> | |
| </p> | |
| <br> | |
| <h1 align="center"> | |
| </h1> | |
| <p align="center"> | |
| An OpenEnv-compatible reinforcement learning environment where an LLM agent must identify all 10 members of a coordinated fake account network hidden inside a synthetic social network. The agent learns via Reflexion and a dynamic hybrid rule/LLM policy , not via gradient updates or fine-tuning. | |
| <br /> | |
| </p> | |
| </p> | |
| <br> | |
| ## Theme | |
| **SUPPORT** | |
| ### Customer Service Agents | |
| Complex environment where agents resolve multi-step queries using external tools and APIs. | |
| ## Problem Statement | |
| **The task:** A social network contains fake accounts organised into a single coordinated ring of 10. The ring behaves in a coordinated way β same posting hour, same IP subnet, stolen celebrity photos, copy-paste bios. The agent must find all 10 by navigating a limited step budget, inspecting accounts, and flagging suspects. | |
| ## Proposed Solution | |
| An OpenEnv-compatible reinforcement learning environment where an LLM agent must identify all 10 members of a coordinated fake account ring hidden inside a synthetic social network. The agent learns via **Reflexion** and a **dynamic hybrid rule/LLM policy** β not via gradient updates or fine-tuning. | |
| --- | |
| ## Novelty Highlights | |
| - **Adaptive Hybrid Intelligence (Rules + LLM):** Unlike static ensembles, GraphStrike dynamically blends deterministic rules and LLM reasoning using a trust gate, shifting control as performance improves. | |
| - **Learning Without Fine-Tuning:** Instead of updating model weights, the agent learns through Reflexion lessons and best-trajectory memory injected into future prompts. | |
| - **Graph-First Detection Pipeline:** Detection is not account-by-account only; it uses cascade effects, neighbor propagation, and multi-hop graph expansion to uncover coordinated rings. | |
| - **Math-Grounded Decision Control:** Risk composition, trust calibration, and grader alignment are formula-driven, making behavior interpretable and reproducible. | |
| - **Adversarial Evasion Benchmarking:** Hard-mode includes timed evasion events, so success reflects robustness under disruption rather than overfitting to static patterns. | |
| - **Safety-Net by Design:** High-confidence rule overrides prevent catastrophic LLM errors while preserving LLM flexibility for strategic exploration. | |
| --- | |
| ## Performance Summary | |
| We evaluate GraphStrike's hybrid rule/LLM policy across multiple *frontier models to measure how well each model handles the investigation task. All runs use | |
| the same inference pipeline (`inference.py`) with identical system prompts and structured logging. Each model ran: (1) seed=0 on all 3 tasks, and | |
| (2) seeds 0-2 on all 3 tasks for variance measurement.* | |
| **Seed=0 scores (single episode per task):** | |
| <p align="center"> | |
| <img src="images/table1.png" alt="Model Performance Table" width="1600"/> | |
| </p> | |
| <br> | |
| **3-seed variance scores (mean across seeds 0, 1, 2):** | |
| <p align="center"> | |
| <img src="images/table2.png" alt="Model Performance Table" width="1600"/> | |
| </p> | |
| <br> | |
| **Rule-Based Baseline (no LLM, deterministic)** | |
| <p align="center"> | |
| <img src="images/table3.png" alt="Model Performance Table" width="1600"/> | |
| </p> | |
| <br> | |
| --- | |
| ## Table of Contents | |
| 1. [What This Is](#1-what-this-is) | |
| 2. [The Problem: How Fake Detection Actually Works](#2-the-problem-how-fake-detection-actually-works) | |
| 3. [Synthetic Data Generation](#3-synthetic-data-generation) | |
| 4. [Data Model](#4-data-model) | |
| 5. [The RL Environment](#5-the-rl-environment) | |
| 6. [Risk Scoring Mathematics](#6-risk-scoring-mathematics) | |
| 8. [The LLM Policy (Qwen3 via Bedrock)](#8-the-llm-policy-qwen3-via-bedrock) | |
| 9. [Reflexion β How the Agent Learns](#9-reflexion--how-the-agent-learns) | |
| 10. [Hybrid Policy β The Novel Contribution](#10-hybrid-policy--the-novel-contribution) | |
| 11. [Training Loop End-to-End](#11-training-loop-end-to-end) | |
| 12. [API Reference](#12-api-reference) | |
| 13. [Docker Deployment](#13-docker-deployment) | |
| 14. [Submission Requirements](#14-submission-requirements) | |
| 15. [Verification & Validation](#15-verification--validation) | |
| --- | |
| ## 1. What is this !? | |
| This is an **OpenEnv hackathon** submission. OpenEnv is a framework for building RL environments with a standard microservice interface (`/reset`, `/step`, `/state`) so that any agent implementation can plug in. | |
| **What makes this non-trivial:** | |
| - The network is large (50β1000 accounts depending on difficulty). | |
| - Fake accounts are mixed with innocent high-signal "decoy" accounts. | |
| - In hard mode, the gang actively evades β dropping intra-gang follows, renaming profiles β while the agent is mid-investigation. | |
| - The agent cannot see the full network upfront: it must explore via INSPECT and INVESTIGATE_NETWORK actions, spending steps to reveal information. | |
| **What makes the learning novel:** | |
| - The LLM (inference via AWS Bedrock) cannot be fine-tuned β it is a black-box API. | |
| - The agent learns via **Reflexion**: post-episode lessons are written back into memory and injected into every future prompt. | |
| - A **dynamic hybrid policy** (Ξ±-weighted) blends the LLM with a deterministic rule engine, with the blend weight Ξ± updating based on recent win rate. Rules dominate early; the LLM takes over as it proves itself. | |
| ### System Architecture | |
|  | |
| --- | |
| ## 2. The Problem: How Fake Detection Actually Works | |
| A real-world fake account detector does **not** read post content. Detection relies on three categories of signals computed from metadata: | |
| ### Signal Hierarchy (Node -> Behavioral -> Graph) | |
|  | |
| - **Node signals (offline):** content fingerprints like photo reuse, bio-template similarity, and comment repetition provide the first suspicion layer. | |
| - **Behavioral signals (temporal/device):** coordinated posting hour, account-age clustering, and shared IP subnet add stronger gang-level evidence. | |
| - **Graph signals (live at INSPECT):** mutual follows, flagged-neighbor growth, and cluster alignment are hardest to evade, so they carry the highest weight in risk scoring. | |
| - **False-positive control:** high-legitimacy hubs (for example celebrities) are down-weighted through hub-legitimacy discounting. | |
| --- | |
| ## 3. Synthetic Data Generation | |
| **File:** `server/generator.py` | |
| Episodes are generated deterministically by seed. 150 episodes are pre-generated (50 per task) and cached as JSON files in `episodes/`. | |
| ### Network Composition | |
| | Task | Network size | Gang | Decoys | Real | Max steps | | |
| |---|---|---|---|---|---| | |
| | easy | 50 | 10 | 0 | 40 | 30 | | |
| | medium | 200 | 10 | 20 | 170 | 50 | | |
| | hard | 1000 | 10 | 50 | 940 | 80 | | |
| - **Gang accounts:** All 10 share `base_age` (same creation week), tightly clustered `avg_post_hour`, high `photo_reuse_score`/`bio_template_score`, `comment_repeat_score` in [0.60, 0.90], `ip_cluster_id = "ip_gang_{seed}"`, and dense intra-gang follow edges (density 0.60β0.80). | |
| - **Real accounts:** Log-normal follower distributions, unique IP clusters, low fake scores. | |
| - **Decoy accounts** (medium/hard): Real accounts with elevated fraud scores (0.20β0.40 range) β they look suspicious but are NOT gang members and penalise reckless flagging. | |
| - **Celebrity accounts** (2 per episode): 100kβ5M followers, very low fake scores, high `hub_legitimacy_score`. | |
| - **Zero-edge isolates** (2 per episode): No edges β test whether the agent wastes steps on disconnected nodes. | |
| --- | |
| ## 4. Data Model | |
| **File:** `models.py` | |
| ### ActionType | |
| | Value | Cost | Effect | | |
| |---|---|---| | |
| | `inspect` | 1 step | Reveals full `AccountProfile` + follow list | | |
| | `investigate_network` | 2 steps | Expands 2 hops; reveals account IDs only | | |
| | `flag` | 0 steps | Marks account as gang member; triggers SUSPECT cascade | | |
| | `unflag` | 0 steps | Removes flag; clears CONFIRMED_FAKE status | | |
| | `submit` | 0 steps | Ends episode; triggers scoring | | |
| ### AccountProfile β key fields | |
| | Category | Fields | | |
| |---|---| | |
| | Raw counts | `follower_count`, `following_count`, `post_count` | | |
| | Temporal | `avg_post_hour`, `account_age_days` | | |
| | Content pipeline (0β1) | `photo_reuse_score`, `bio_template_score`, `comment_repeat_score` | | |
| | IP/device | `shared_ip_count`, `ip_cluster_id` | | |
| | Graph (live at INSPECT) | `mutual_follow_rate`, `flagged_neighbor_count`, `avg_neighbor_photo_reuse`, `post_hour_cluster_score` | | |
| | Risk breakdown | `fake_risk_score`, `node_risk`, `behavior_risk`, `graph_risk`, `hub_legitimacy_score` | | |
| | Evasion/status | `name_change_count`, `status` (NORMAL/SUSPECT/CONFIRMED_FAKE) | | |
| ### FakeGangObservation β what the agent sees each step | |
| `done`, `reward`, `visible_accounts`, `visible_account_ids`, `flagged_ids`, `inspected_ids`, `suspect_ids`, `graph_edges`, `steps_remaining`, `evasion_triggered`, `evasion_count`, `task`, `message` | |
| --- | |
| ## 5. The RL Environment | |
| **File:** `server/environment.py` | |
| ### Episode Lifecycle & Action Mechanics | |
|  | |
| **FLAG cascade (dual):** When FLAG(X) is called β (1) every visible account that X follows becomes SUSPECT via the follow-graph, and (2) every visible account sharing X's `ip_cluster_id` becomes SUSPECT. Gang members share `ip_gang_{seed}`; real accounts have unique IPs β zero false positives. | |
| ### Reward Function | |
| ``` | |
| base_reward = tpΓ1.0 β fpΓ0.5 β fnΓ0.3 | |
| Win condition: | |
| easy/medium: recall β₯ 0.8 AND precision β₯ 0.7 | |
| hard: recall β₯ 0.9 AND precision β₯ 0.8 | |
| Bonuses: | |
| +5.0 full win | |
| +3.0 perfect recall | |
| +2.0 partial win (high recall, low precision) | |
| +1.0 efficiency (SUBMIT with β₯50% steps remaining) | |
| β1.0 per evasion event (hard mode) | |
| β2.0 forced submit (ran out of steps) | |
| ``` | |
| ### Evasion (hard mode) | |
| - **`unfollow_intragang`:** 30% of intra-gang edges randomly removed mid-investigation β destroys graph signal. Fires 4 times (steps 15, 30, 45, 60). | |
| - **`rename_count`:** Random gang members get `name_change_count += 1` β a visual evasion signal. | |
| --- | |
| ## 6. Risk Scoring Mathematics | |
| **File:** `server/scoring.py` β all functions are stateless and deterministic. | |
|  | |
|  | |
|  | |
| --- | |
| ## 8. The LLM Policy (Qwen3 via Bedrock) | |
| **File:** `agent/policy.py` | |
| **Model:** `qwen.qwen3-next-80b-a3b` via AWS Bedrock Converse API (`maxTokens=512, temperature=0.4`) | |
| ### Prompt Structure | |
| Every step, the policy builds a prompt from three components: | |
| ``` | |
| [reflections from past episodes] β grows richer every episode | |
| [best trajectory few-shot example] β best win ever, showing the full action log | |
| βββ CURRENT CASE βββ | |
| [formatted observation] β status badges, risk scores, suspect list | |
| What is your next action? | |
| ``` | |
| Accounts in the observation are **sorted by `fake_risk_score` descending**, with status badges prepended. `fnbr=N(!)` highlights when `flagged_neighbor_count > 0`; `[HUB?]` warns the LLM not to flag high-legitimacy accounts. | |
| ### Required Response Format | |
| ```xml | |
| <thinking> | |
| Reasoning β which account is most suspicious and why. | |
| </thinking> | |
| <action> | |
| INSPECT acc_0041 | |
| </action> | |
| ``` | |
| If parsing fails, a heuristic fallback inspects the highest-scored uninspected account. Retries use exponential backoff (1s, 2s, 4s) up to 3 attempts. | |
| --- | |
| ## 9. Reflexion β How the Agent Learns | |
| **Files:** `agent/reflection.py`, `agent/memory.py` | |
| The agent **cannot** update Qwen3's weights β Bedrock is a black-box API. Instead, it learns via **Reflexion**: post-episode lessons are written as text and injected into future prompts. | |
| ### Reflexion Learning Loop | |
|  | |
| ``` | |
| Episode N: | |
| 1. LLM acts using: system_prompt + reflections[last 4] + best_trajectory | |
| 2. Episode ends β WIN or LOSS | |
| 3. Post-episode: | |
| LOSS β generate_reflection(action_log, outcome) β lesson stored | |
| WIN β save trajectory if better reward + generate_success_reflection | |
| Episode N+1: | |
| β last 4 reflections + best win trajectory injected into prompt | |
| β LLM has learned from its past | |
| ``` | |
| **Example generated reflection:** | |
| > *"The starting accounts were all real; I wasted 8 steps inspecting low-signal nodes before pivoting. When photo_reuse and bio_template are both below 0.3 after 3 inspections, immediately use INVESTIGATE_NETWORK to jump to a different graph region."* | |
| All memory persists in a Docker volume (`memory/`) across container restarts β reflections, best trajectories, win history, and Ξ± values per task. | |
| --- | |
| ## 10. Hybrid Policy β The Novel Contribution | |
| **File:** `agent/hybrid_policy.py` | |
| **Key insight:** A new LLM agent starts dumb but improves over time. A rule engine is always consistent but cannot adapt. The hybrid policy exploits both β rules provide a safety net early while the LLM builds its track record; once the LLM proves itself, rules step back. | |
| ### Architecture | |
|  | |
| ### Alpha (Ξ±): The Trust Weight | |
| Ξ± is a per-task value in [0.20, cap] representing current trust in the LLM: | |
| ``` | |
| reflection_factor = min(1.0, n_reflections / 4.0) | |
| raw = 0.20 + reflection_factor Γ (0.80 Γ recent_win_rate + 0.12) | |
| Ξ± = clamp(raw, 0.20, cap) | |
| ``` | |
| | Task | Ξ± cap | Rationale | | |
| |---|---|---| | |
| | easy | 0.50 | Rule engine alone achieves ~91% β LLM should assist, not override | | |
| | medium | 0.70 | Decoys require some LLM judgment, but cascade must stay | | |
| | hard | 0.85 | LLM needs latitude for evasion adaptation, but safety rules remain | | |
| **Alpha trajectory over training (easy task, cap=0.50):** | |
| | Episode | Win rate | Reflections | Ξ± (capped) | | |
| |---|---|---|---| | |
| | 1 | 0% | 0 | 0.20 | | |
| | 5 | 20% | 4 | 0.48 | | |
| | 10 | 50% | 9 | **0.50** | | |
| | 20 | 80% | 19 | **0.50** | | |
| <br> | |
|  | |
| ### Rule Confidence Levels | |
| | Situation | Action | Confidence | | |
| |---|---|---| | |
| | Steps remaining = 0 | SUBMIT | 1.00 | | |
| | Uninspected SUSPECT accounts exist | INSPECT suspects[0] | 0.95 | | |
| | `fake_risk β₯ 0.85` | FLAG that account | 0.95 | | |
| | `fake_risk` in [threshold, 0.85) | FLAG that account | 0.70+ | | |
| | 10 accounts already flagged | SUBMIT | 0.85 | | |
| | Steps remaining β€ 3 | SUBMIT | 0.90 | | |
| | Uninspected accounts available | INSPECT top candidate | 0.30 | | |
| At **Ξ±=0.20** (early): rules dominate (~90% of decisions). At **Ξ±=0.50** (moderate): LLM controls exploration; rules control safety. At **Ξ±=0.85** (high): LLM controls most decisions; rules only override forced submits and uninspected suspects. | |
| Ξ± is saved to `memory/alpha_{task}.json` and persists across Docker restarts β the agent doesn't reset to 0.20 every time. | |
| --- | |
| ## 11. Training Loop End-to-End | |
| **File:** `train.py` | |
| ### Curriculum | |
| | Phase | Episodes | Task | Goal | | |
| |---|---|---|---| | |
| | 1 | 1β20 | easy | Learn basic signal thresholds, build first reflections | | |
| | 2 | 21β35 | medium | Handle decoys, learn evasion response | | |
| | 3 | 36β50 | hard | Feature-only detection, persistent evasion | | |
| Seeds rotate deterministically: `seed = (episode_num + task_offset) % 50` | |
| ### Per-Episode Flow | |
| ``` | |
| for ep in range(n_episodes): | |
| 1. DETERMINE TASK curriculum_task(ep) or fixed task | |
| 2. COMPUTE ALPHA compute_alpha(win_rate, n_reflections, task) | |
| 3. LOAD CONTEXT last 4 reflections + best win trajectory | |
| 4. RUN EPISODE while not obs.done: | |
| blend(rule_action, llm_action, rule_conf, Ξ±) | |
| β obs = env.step(final) | |
| 5. POST-EPISODE record_win β update Ξ± β generate reflection | |
| 6. LOG task | win/loss | reward | recall | precision | Ξ± | modes | |
| ``` | |
| Episode metrics (flushed to `runs/metrics.jsonl` every 5 episodes) include: `episode`, `task`, `won`, `reward`, `recall`, `precision`, `steps_used`, `alpha_used`, `mode_agree`, `mode_rule`, `mode_llm`, `n_reflections_used`. | |
| You can watch the transition: early episodes have high `rule` counts; later episodes have high `agree` counts (LLM learned to make the same decisions as the rules, but also brings strategic reasoning the rules can't). | |
| --- | |
| ## 12. API Reference | |
| **File:** `server/app.py` | |
| | Endpoint | Method | Description | | |
| |---|---|---| | |
| | `/health` | GET | `{"status": "healthy"}` | | |
| | `/tasks` | GET | Task list + `action_schema` + `score_range: [0.0, 1.0]` | | |
| | `/reset` | POST | Accepts `{task, seed}` β returns initial observation | | |
| | `/step` | POST | Accepts any `FakeGangAction` β returns updated observation | | |
| | `/state` | GET | Current episode metadata (step count, task, score) | | |
| | `/grader` | GET | Normalised [0.0, 1.0] score after SUBMIT | | |
| | `/baseline` | POST | Runs rule-based agent on all 3 tasks, returns scores | | |
| **Baseline performance:** | |
| | Task | Seed=0 score | Win rate (50 seeds) | Mean score (50 seeds) | | |
| |---|---|---|---| | |
| | easy | 0.91 | 100% | ~0.91 | | |
| | medium | 0.906 | 84% | ~0.77 | | |
| | hard | 0.9038 | 52% | ~0.47 | | |
| --- | |
| ## 13. Docker Deployment | |
| ```bash | |
| # Build | |
| docker build -f server/Dockerfile -t graphstrike . | |
| # Run | |
| docker run -it \ | |
| -e AWS_ACCESS_KEY_ID=your_key \ | |
| -e AWS_SECRET_ACCESS_KEY=your_secret \ | |
| -v $(pwd)/memory:/app/memory \ | |
| -v $(pwd)/runs:/app/runs \ | |
| -p 8000:8000 \ | |
| graphstrike | |
| ``` | |
| The `memory/` and `runs/` volumes preserve all learning between container restarts. | |
| ### Environment Variables | |
| | Variable | Default | Description | | |
| |---|---|---| | |
| | `AWS_ACCESS_KEY_ID` | (required) | For Bedrock/Qwen3 access | | |
| | `AWS_SECRET_ACCESS_KEY` | (required) | For Bedrock/Qwen3 access | | |
| | `AWS_DEFAULT_REGION` | `us-east-1` | Bedrock region | | |
| | `TRAIN_TASK` | (curriculum) | Fix to `easy`/`medium`/`hard` | | |
| | `TRAIN_EPISODES` | `50` | Total training episodes | | |
| | `TRAIN_TEMP` | `0.4` | LLM sampling temperature | | |
| | `TRAIN_VERBOSE` | `0` | Set `1` for per-step action logging | | |
| | `SERVER_PORT` | `8000` | FastAPI port | | |
| ### Startup Sequence (`run.sh`) | |
| ``` | |
| 1. Validate AWS credentials | |
| 2. python server/generator.py β generates 150 episode JSON files | |
| 3. uvicorn server.app:app β starts the environment server | |
| 4. Health check polling β waits until /health responds | |
| 5. python train.py β runs the full training loop | |
| ``` | |
| --- | |
| ### Full HTTP validation | |
| ```bash | |
| python3 -m uvicorn server.app:app --port 8001 & | |
| sleep 3 | |
| python3 validate.py --url http://localhost:8001 | |
| # Expected: Results: 24/24 passed β all OK | |
| ``` | |
| ### Deployed Endpoint Verification | |
| ```bash | |
| curl https://pandago-graphstrike.hf.space/health | |
| # β {"status": "healthy"} | |
| curl https://pandago-graphstrike.hf.space/tasks | |
| # β {"tasks": ["easy","medium","hard"], "action_schema": {...}, "score_range": [0.0, 1.0]} | |
| curl -X POST https://pandago-graphstrike.hf.space/baseline | |
| # β {"scores": {"easy": 0.91, "medium": 0.906, "hard": 0.9038}, "agent": "rule_based"} | |
| ``` | |
| --- | |
|  | |
| ## Developed with β€οΈ by Team ComputeXOR | |
| ### { | |
| ### [Sai Nivedh](https://github.com/SaiNivedh26) , | |
| ### [Charuvarthan](https://github.com/Charuvarthan-T) , | |
| ### [Sajeev](https://github.com/SajeevSenthil) | |
| ### } | |