---
title: GraphRAG vs Vector RAG — Fraud Detection Benchmark
emoji: 🕸️
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.44.0
app_file: app.py
pinned: true
tags:
  - graph-neural-networks
  - fraud-detection
  - neo4j
  - rag
  - llm
  - groq
  - mlops
---

# 🕸️ GraphRAG vs Vector RAG — Live Fraud Detection Benchmark

**By [Daniel Fonseca](https://linkedin.com/in/daniel-fonsecaai) · AI/ML Engineer · Graph Neural Networks · Fraud Detection**

[![Neo4j](https://img.shields.io/badge/Neo4j-Aura-00ED64?style=flat&logo=neo4j)](https://neo4j.com)
[![Groq](https://img.shields.io/badge/LLM-Groq%20%2F%20Llama%203.1-blueviolet)](https://groq.com)
[![Streamlit](https://img.shields.io/badge/Frontend-Streamlit-FF4B4B)](https://streamlit.io)

---

## What this demo shows

A live benchmark comparing two RAG architectures on **fraud detection queries**:

| | GraphRAG | Vector RAG |
|---|---|---|
| Retrieval | Cypher → Neo4j graph traversal | Embedding → cosine similarity |
| Precision (relational queries) | ~94% | ~38% |
| Latency | ~60 ms | ~300 ms |
| Money mule chains | ✅ Full path | ❌ Cannot traverse |
| Shared device cluster | ✅ Exact | ⚠️ Approximate |

**Core insight**: Fraud lives in *connections*. A device shared by 3 customers, a money mule chain with 3 hops, 6 accounts opened from the same IP: embedding similarity alone struggles to surface these patterns, while a single Cypher traversal returns them exactly.
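
To make the point concrete, here is a minimal in-memory sketch of the shared-device check. It is not the app's code: the `shared_entities` helper and the `C-…`/`D-…` identifiers are hypothetical, and the edge list stands in for `(Customer)-[:USED]->(Device)` relationships. The same grouping works for accounts per IP.

```python
from collections import defaultdict


def shared_entities(edges, min_owners=3):
    """Group owners by the entity they touch and keep entities with enough owners.

    `edges` is an iterable of (owner_id, entity_id) pairs, for example
    (customer, device) or (account, ip).
    """
    owners = defaultdict(set)
    for owner, entity in edges:
        owners[entity].add(owner)
    return {
        entity: sorted(group)
        for entity, group in owners.items()
        if len(group) >= min_owners
    }


# Hypothetical sample data: three customers share device D-9.
used = [("C-1", "D-9"), ("C-2", "D-9"), ("C-3", "D-9"), ("C-4", "D-5")]
clusters = shared_entities(used)
print(clusters)
```

The result is an exact join on the relationship, which is why the table lists the graph side as "Exact" while a similarity search can only rank look-alike records.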

---

## Architecture

```
User question (natural language)
        │
        ▼
Groq/Llama 3.1 ──► Cypher query generation
        │
        ▼
Neo4j Aura ──► Graph traversal (2-5 hops)
        │
        ▼
Structured records ──► Groq/Llama ──► Fraud analysis answer
```
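
The flow in the diagram can be sketched as three injected steps. This is an illustration under assumptions, not the contents of `app.py`: `answer_question` and the `fake_*` stand-ins are hypothetical names, and a real deployment would back them with Groq and Neo4j clients.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def answer_question(
    question: str,
    generate_cypher: Callable[[str], str],
    run_cypher: Callable[[str], List[Record]],
    summarise: Callable[[str, List[Record]], str],
) -> Dict[str, object]:
    """Run the three stages in order and keep each intermediate result."""
    cypher = generate_cypher(question)      # LLM: question -> Cypher
    records = run_cypher(cypher)            # graph database: Cypher -> records
    answer = summarise(question, records)   # LLM: records -> fraud analysis
    return {"cypher": cypher, "records": records, "answer": answer}


# Stand-ins so the sketch runs without Groq or Neo4j credentials.
def fake_generate(question: str) -> str:
    return (
        "MATCH (c:Customer)-[:USED]->(d:Device) "
        "RETURN d.id AS device, count(c) AS customers"
    )


def fake_run(cypher: str) -> List[Record]:
    return [{"device": "D-9", "customers": 3}]


def fake_summarise(question: str, records: List[Record]) -> str:
    return f"{len(records)} device(s) flagged for: {question}"


result = answer_question(
    "Which devices are shared?", fake_generate, fake_run, fake_summarise
)
print(result["answer"])
```

Passing the stages in as callables keeps the orchestration testable and makes it easy to swap the stand-ins for real clients.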

---

## Graph schema

```
(Customer)-[:HAS_ACCOUNT]->(Account)
(Customer)-[:USED]->(Device)
(Account)-[:ACCESSED_FROM]->(IP)
(Account)-[:TRANSFER {amount, date}]->(Account)
(Account)-[:TRANSACTION {amount, type}]->(Merchant)
```

Detectable fraud patterns:
- 🔴 **Shared device cluster** — emulator farms, identity theft
- 🔴 **IP overlap** — account opening fraud
- 🔴 **Money mule chain** — layering (A-102 → A-445 → A-667 → A-890)
- 🔴 **Card testing** — micro-transactions on merchants
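
As a sketch of how two of these patterns might be queried against the schema above, the Cypher strings below assume each node carries an `id` property, which the schema does not state. The `transfer_chains` helper is a hypothetical in-memory analogue of the fixed-length `TRANSFER` traversal, included so the example runs without a database.

```python
# Cypher sketches; the `id` property on nodes is an assumption.
SHARED_DEVICE_QUERY = """
MATCH (c:Customer)-[:USED]->(d:Device)
WITH d, collect(c.id) AS customers
WHERE size(customers) >= 3
RETURN d.id AS device, customers
"""

MULE_CHAIN_QUERY = """
MATCH path = (src:Account)-[:TRANSFER*3]->(dst:Account)
RETURN [n IN nodes(path) | n.id] AS chain
"""


def transfer_chains(transfers, hops):
    """Return every path of exactly `hops` transfers that never revisits an account."""
    outgoing = {}
    for src, dst in transfers:
        outgoing.setdefault(src, []).append(dst)

    chains = []

    def walk(path):
        if len(path) == hops + 1:
            chains.append(path)
            return
        for nxt in outgoing.get(path[-1], []):
            if nxt not in path:  # skip cycles
                walk(path + [nxt])

    for start in outgoing:
        walk([start])
    return chains


# The layering chain from the list above, as (source, destination) pairs.
transfers = [("A-102", "A-445"), ("A-445", "A-667"), ("A-667", "A-890")]
chains = transfer_chains(transfers, hops=3)
print(chains)  # [['A-102', 'A-445', 'A-667', 'A-890']]
```

Changing `*3` to a range such as `*2..5` in the Cypher pattern would cover the 2 to 5 hop traversals mentioned in the architecture diagram.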

---

## Setup (add to HF Secrets)

| Secret | Description |
|--------|-------------|
| `NEO4J_URI` | Neo4j Aura connection URI (`neo4j+s://...`) |
| `NEO4J_USER` | Usually `neo4j` |
| `NEO4J_PASSWORD` | Your Aura password |
| `GROQ_API_KEY` | Free at [console.groq.com](https://console.groq.com) |

After adding the secrets, click **"Seed fraud graph"** in the sidebar to populate Neo4j.

> Without credentials the app runs in demo mode with realistic simulated responses.
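
One way the credential check could work is sketched below. It assumes the app reads the four secrets from environment variables, which is how Spaces exposes them at runtime; the `load_settings` helper and its return shape are hypothetical, not taken from `app.py`.

```python
import os

REQUIRED_SECRETS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD", "GROQ_API_KEY")


def load_settings(env=None):
    """Collect the required secrets and flag demo mode if any is missing or empty."""
    env = os.environ if env is None else env
    values = {name: env.get(name, "") for name in REQUIRED_SECRETS}
    missing = [name for name, value in values.items() if not value]
    return {"values": values, "missing": missing, "demo_mode": bool(missing)}


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"NEO4J_URI": "neo4j+s://example", "NEO4J_USER": "neo4j"})
print(settings["demo_mode"], settings["missing"])
```

When `demo_mode` is false, the values can be handed to the database and LLM clients; when it is true, the app can fall back to simulated responses.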

---

## Related projects

- [IBM Safer Payments — AUC-ROC 0.9591](https://huggingface.co/spaces/daniel-fonsecaai)
- [HetGNN Fraud Graph Explorer](https://huggingface.co/spaces/daniel-fonsecaai)
- [Agentic RAG Pipeline on Kubernetes](https://huggingface.co/spaces/daniel-fonsecaai)

---

*Built with Neo4j Aura · Groq · Llama 3.1 · Streamlit · PyVis · Plotly*