---
title: GraphRAG vs Vector RAG Fraud Detection Benchmark
emoji: 🕸️
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.44.0
app_file: app.py
pinned: true
tags:
- graph-neural-networks
- fraud-detection
- neo4j
- rag
- llm
- groq
- mlops
---
# 🕸️ GraphRAG vs Vector RAG — Live Fraud Detection Benchmark
**By [Daniel Fonseca](https://linkedin.com/in/daniel-fonsecaai) · AI/ML Engineer · Graph Neural Networks · Fraud Detection**
[![Neo4j](https://img.shields.io/badge/Neo4j-Aura-00ED64?style=flat&logo=neo4j)](https://neo4j.com)
[![Groq](https://img.shields.io/badge/LLM-Groq%20%2F%20Llama%203.1-blueviolet)](https://groq.com)
[![Streamlit](https://img.shields.io/badge/Frontend-Streamlit-FF4B4B)](https://streamlit.io)
---
## What this demo shows
A live benchmark comparing two RAG architectures on **fraud detection queries**:
| | GraphRAG | Vector RAG |
|---|---|---|
| Retrieval | Cypher → Neo4j graph traversal | Embedding → cosine similarity |
| Precision | ~94% on relational queries | ~38% |
| Latency | ~60ms | ~300ms |
| Money mule chains | ✅ Full path | ❌ Cannot traverse |
| Shared device cluster | ✅ Exact | ⚠️ Approximate |
**Core insight**: Fraud lives in *connections*. A device shared by 3 customers, a 3-hop money mule chain, 6 accounts opened from the same IP: vector search ranks text chunks by similarity to the query and has no notion of following an edge, so these patterns are hard to recover from embeddings, while a single Cypher traversal finds them directly.
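
As a rough illustration of why this is a relational question, the sketch below groups customers by device in plain Python. The customer and device IDs are hypothetical, and the edge list only stands in for "customer used device" relationships; in the app itself this lookup would be a graph query rather than in-memory code.

```python
from collections import defaultdict

# Hypothetical (customer, device) pairs; each pair stands in for one
# "customer used device" relationship in the graph.
USED_EDGES = [
    ("C-1", "D-9"),
    ("C-2", "D-9"),
    ("C-3", "D-9"),
    ("C-4", "D-7"),
]


def shared_devices(edges, min_customers=3):
    """Return {device: sorted customer list} for devices with enough distinct users."""
    customers_by_device = defaultdict(set)
    for customer, device in edges:
        customers_by_device[device].add(customer)
    return {
        device: sorted(customers)
        for device, customers in customers_by_device.items()
        if len(customers) >= min_customers
    }


clusters = shared_devices(USED_EDGES)
print(clusters)
```

The answer depends on counting distinct neighbours of a node, which is a join over edges and not a property of any single text chunk.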
---
## Architecture
```
User question (natural language)
        ↓
Groq/Llama 3.1 ──► Cypher query generation
        ↓
Neo4j Aura ──► Graph traversal (2-5 hops)
        ↓
Structured records ──► Groq/Llama ──► Fraud analysis answer
```
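
The flow above can be sketched as three pluggable steps. This is a minimal outline and not the code in `app.py`: the function names, the record shape, and the stand-in implementations are all assumptions, chosen so the example runs without Neo4j or Groq credentials.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def answer_question(
    question: str,
    generate_cypher: Callable[[str], str],
    run_cypher: Callable[[str], List[Record]],
    summarize: Callable[[str, List[Record]], str],
) -> Dict[str, Any]:
    """Run the question -> Cypher -> records -> answer flow with injected steps."""
    cypher = generate_cypher(question)      # LLM call 1: natural language to Cypher
    records = run_cypher(cypher)            # graph traversal in the database
    answer = summarize(question, records)   # LLM call 2: records to fraud analysis
    return {"cypher": cypher, "records": records, "answer": answer}


# Hypothetical stand-ins; real versions would call the Groq API and Neo4j Aura.
def fake_generate_cypher(question: str) -> str:
    return "MATCH (c:Customer)-[:USED]->(d:Device) RETURN c, d"


def fake_run_cypher(cypher: str) -> List[Record]:
    return [{"customer": "C-1", "device": "D-9"}]


def fake_summarize(question: str, records: List[Record]) -> str:
    return f"{len(records)} record(s) matched for: {question}"


result = answer_question(
    "Which customers share a device?",
    fake_generate_cypher,
    fake_run_cypher,
    fake_summarize,
)
print(result["answer"])
```

Passing the three steps in as callables keeps the orchestration testable and makes it straightforward to swap in simulated responses when no credentials are present.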
---
## Graph schema
```
(Customer)-[:HAS_ACCOUNT]->(Account)
(Customer)-[:USED]->(Device)
(Account)-[:ACCESSED_FROM]->(IP)
(Account)-[:TRANSFER {amount, date}]->(Account)
(Account)-[:TRANSACTION {amount, type}]->(Merchant)
```
Fraud patterns detectable:
- 🔴 **Shared device cluster** — emulator farms, identity theft
- 🔴 **IP overlap** — account opening fraud
- 🔴 **Money mule chain** — layering (A-102 → A-445 → A-667 → A-890)
- 🔴 **Card testing** — micro-transactions on merchants
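
To show what a money mule chain lookup involves, here is a small sketch that walks `TRANSFER` edges in memory, using the account IDs from the example chain above. The Cypher string is an assumption about how such a query might look against this schema; in particular the `id` property on `Account` is hypothetical and not confirmed by the schema listing.

```python
# Assumed Cypher equivalent (the `id` property is hypothetical).
MULE_CHAIN_CYPHER = (
    "MATCH path = (:Account {id: $start})-[:TRANSFER*3..5]->(:Account) "
    "RETURN path"
)

# In-memory stand-in for (Account)-[:TRANSFER]->(Account) edges.
TRANSFERS = {
    "A-102": ["A-445"],
    "A-445": ["A-667"],
    "A-667": ["A-890"],
    "A-890": [],
}


def transfer_chains(graph, start, min_hops=3, max_hops=5):
    """Return simple paths from start that contain min_hops..max_hops transfers."""
    chains = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        hops = len(path) - 1
        if hops >= min_hops:
            chains.append(path)
        if hops == max_hops:
            continue  # do not extend beyond the hop limit
        for nxt in graph.get(path[-1], []):
            if nxt not in path:  # skip cycles so each path stays simple
                stack.append(path + [nxt])
    return chains


chains = transfer_chains(TRANSFERS, "A-102")
print(chains)
```

The hop bounds mirror the 2-5 hop traversal range mentioned in the architecture; the lower bound of 3 here is just a choice for the example.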
---
## Setup (add to HF Secrets)
| Secret | Description |
|--------|-------------|
| `NEO4J_URI` | Neo4j Aura connection URI (`neo4j+s://...`) |
| `NEO4J_USER` | Usually `neo4j` |
| `NEO4J_PASSWORD` | Your Aura password |
| `GROQ_API_KEY` | Free at [console.groq.com](https://console.groq.com) |
After adding the secrets, click **"Seed fraud graph"** in the sidebar to populate Neo4j.
> Without credentials the app runs in demo mode with realistic simulated responses.
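
Hugging Face Spaces expose secrets to the app as environment variables. A minimal sketch of reading the four secrets and falling back to demo mode when any is missing might look like this; the `Settings` class and `load_settings` function are hypothetical names, not taken from `app.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_SECRETS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD", "GROQ_API_KEY")


@dataclass(frozen=True)
class Settings:
    neo4j_uri: Optional[str]
    neo4j_user: Optional[str]
    neo4j_password: Optional[str]
    groq_api_key: Optional[str]
    demo_mode: bool  # True when at least one secret is missing or empty


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the secrets from env; treat empty strings as missing."""
    values = {name: (env.get(name) or None) for name in REQUIRED_SECRETS}
    return Settings(
        neo4j_uri=values["NEO4J_URI"],
        neo4j_user=values["NEO4J_USER"],
        neo4j_password=values["NEO4J_PASSWORD"],
        groq_api_key=values["GROQ_API_KEY"],
        demo_mode=any(value is None for value in values.values()),
    )


settings = load_settings()
print("demo mode" if settings.demo_mode else "live mode")
```

Taking the environment as a parameter makes the fallback logic easy to check with a plain dictionary.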
---
## Related projects
- [IBM Safer Payments — AUC-ROC 0.9591](https://huggingface.co/spaces/daniel-fonsecaai)
- [HetGNN Fraud Graph Explorer](https://huggingface.co/spaces/daniel-fonsecaai)
- [Agentic RAG Pipeline on Kubernetes](https://huggingface.co/spaces/daniel-fonsecaai)
---
*Built with Neo4j Aura · Groq · Llama 3.1 · Streamlit · PyVis · Plotly*