fraudshield-1 / SUBMISSION_CHECKLIST.md
DevikaJ2005's picture
Finalize RL-first environment and explorer UI
30533d1

FraudShield Submission Checklist

Updated: 2026-04-25

Submission links

  • Hugging Face Space: https://huggingface.co/spaces/DevikaJ2005/fraudshield-1
  • GitHub repository: https://github.com/DevikaJ2005/Fraudshield
  • Colab notebook path committed: notebooks/fraudshield_trl_colab.ipynb
  • Public Colab run verified end to end after GPU training
  • Final Hugging Face blog post URL or video/slides URL added to README.md

OpenEnv environment requirements

  • openenv.yaml present and aligned with the current environment
  • OpenEnv API endpoints implemented:
    • /health
    • /reset
    • /step
    • /state
    • /info
    • /tasks
    • /metadata
    • /schema
    • /mcp
  • Typed action, observation, reward, reset, step, and state models
  • Frozen snapshot committed in data/fraudshield_cases.json
  • Three graded tasks: easy, medium, hard
  • Partial observability with progressive evidence reveals
  • Investigation budget per case
  • Workflow-shaped reward design
  • Lightweight browser explorer at / for manual inspection

Local validation status

  • python inference.py
  • python validate_api.py
  • python -m openenv.cli validate .
  • python -m openenv.cli validate --url http://127.0.0.1:7860
    • local browser health checks passed, but the validator still timed out against the subprocess-hosted server on this machine
  • docker build -t fraudshield .
    • blocked by the local Docker Desktop / WSL engine state, not by the project code

Current baseline

Heuristic baseline (rule-based, not trained):

  • Easy: 0.9900
  • Medium: 0.3500
  • Hard: 0.7425
  • Final: 0.6942

Training deliverables

  • RL training notebook scaffold committed
  • Notebook structured for Colab + GPU workflow
  • Notebook supports LOCAL_MODEL_PATH evaluation after training
  • Real Colab training run completed
  • reward_curve.png committed
  • loss_curve.png committed
  • training_summary.json updated with trained results
  • README results section updated with trained-vs-heuristic comparison

Presentation deliverables

  • README rewritten around the RL environment story
  • HF_BLOG_DRAFT.md present
  • Deployment guide present
  • Final blog published on Hugging Face
  • Final screenshots captured from the explorer UI and/or API walkthrough

What is still needed from you

  • Run the Colab notebook with a GPU runtime
  • Add your Hugging Face token inside Colab so trained artifacts can be saved if you want
  • Publish the final blog or video/slides link
  • Share Space build logs with me if the HF deployment needs one more cleanup pass