---
title: Bitcoin Abuse Scoring (GAT / GATv2)
emoji: 🧭
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---

# Bitcoin Abuse Scoring (GAT / GATv2) — Hugging Face Space

This Space builds an **ego-subgraph** around a given Bitcoin transaction hash (expanding `k` hops backward and forward), then runs **two pretrained GNN models** (a GAT baseline and an enhanced GATv2), both trained on the **Elliptic** dataset, to score how likely the center transaction is to be _abuse_.

## ✅ Features

- Data sources (public JSON APIs, no scraping): `mempool.space` / `blockstream.info` (Esplora), with `Blockchair` as a fallback (API key optional).
- Ego-subgraph expansion **k ∈ {1,2,3}** (both parents & children).
- Graph safeguards: `MAX_NODES` & `MAX_EDGES` to avoid explosion.
- Node features: degree stats, value sums/logs, counts, ratio, distance-to-center, block height.
- Features are standardized on the fly. If your model was trained with different features or a different scaler, leave `USE_FEATURE_ADAPTER=true` (the default); it inserts a `Linear` projection that maps the computed features to the model's expected input dimension (165 by default).
- Both models are loaded from the **Hugging Face Hub**, each with a decision threshold read from `thresholds.json` (falling back to `DEFAULT_THRESHOLD`, which is `0.5` by default).
- **Rate limit**: 20 requests/min globally (sliding window).
- Visualizations: **ego-graph (pyvis HTML)** & **histogram of scores** per model.
- CPU-only deployment on Spaces.
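
The expansion and safeguard bullets above can be sketched as a bounded breadth-first search. This is a minimal illustration rather than the app's actual code: `build_ego_graph` and the `neighbors` callback are hypothetical names, and the callback stands in for whatever provider lookup returns a transaction's parent and child edges.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed (spending-from txid, spending txid)


def build_ego_graph(
    center: str,
    k: int,
    neighbors: Callable[[str], Iterable[Edge]],  # hypothetical provider lookup
    max_nodes: int = 5000,
    max_edges: int = 15000,
) -> Tuple[Dict[str, int], Set[Edge], bool]:
    """Expand up to k hops around `center`, stopping growth at the caps.

    Returns (hop distance per node, edge set, truncated flag).
    """
    dist: Dict[str, int] = {center: 0}
    edges: Set[Edge] = set()
    queue = deque([center])
    truncated = False
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # nodes at the k-hop frontier are not expanded further
        for src, dst in neighbors(node):
            other = dst if src == node else src
            new_node = other not in dist
            new_edge = (src, dst) not in edges
            over_nodes = new_node and len(dist) >= max_nodes
            over_edges = new_edge and len(edges) >= max_edges
            if over_nodes or over_edges:
                truncated = True  # keep what we have, skip this edge
                continue
            if new_node:
                dist[other] = dist[node] + 1
                queue.append(other)
            edges.add((src, dst))
    return dist, edges, truncated
```

The returned hop distances can double as the distance-to-center feature, and the `truncated` flag is a natural trigger for a log warning.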

## 🔧 Configuration

Set these **Environment Variables** (Space → Settings → Variables):

```
HF_GAT_BASELINE_REPO=org/name_gat_baseline
HF_GATV2_REPO=org/name_gatv2

# (Optional overrides)
IN_CHANNELS=165
HIDDEN_CHANNELS=128
HEADS=8
NUM_BLOCKS=2
DROPOUT=0.5

DATA_PROVIDER=mempool    # mempool | blockstream | blockchair
HTTP_TIMEOUT=10
HTTP_RETRIES=2
MAX_NODES=5000
MAX_EDGES=15000
USE_FEATURE_ADAPTER=true
DEFAULT_THRESHOLD=0.5
QUEUE_CONCURRENCY=2
BLOCKCHAIR_API_KEY=
```
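
As an illustration of how these variables might be consumed, the sketch below parses them into a typed settings object. It assumes the values listed above are also the defaults and that the two repo variables are required; `Settings` and `load_settings` are hypothetical names, not identifiers taken from `app.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gat_repo: str
    gatv2_repo: str
    in_channels: int
    hidden_channels: int
    heads: int
    num_blocks: int
    dropout: float
    data_provider: str
    http_timeout: float
    http_retries: int
    max_nodes: int
    max_edges: int
    use_feature_adapter: bool
    default_threshold: float
    queue_concurrency: int
    blockchair_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (defaults are assumptions)."""
    provider = env.get("DATA_PROVIDER", "mempool").strip().lower()
    if provider not in {"mempool", "blockstream", "blockchair"}:
        raise ValueError(f"Unsupported DATA_PROVIDER: {provider!r}")
    return Settings(
        gat_repo=env["HF_GAT_BASELINE_REPO"],  # KeyError if missing
        gatv2_repo=env["HF_GATV2_REPO"],       # KeyError if missing
        in_channels=int(env.get("IN_CHANNELS", "165")),
        hidden_channels=int(env.get("HIDDEN_CHANNELS", "128")),
        heads=int(env.get("HEADS", "8")),
        num_blocks=int(env.get("NUM_BLOCKS", "2")),
        dropout=float(env.get("DROPOUT", "0.5")),
        data_provider=provider,
        http_timeout=float(env.get("HTTP_TIMEOUT", "10")),
        http_retries=int(env.get("HTTP_RETRIES", "2")),
        max_nodes=int(env.get("MAX_NODES", "5000")),
        max_edges=int(env.get("MAX_EDGES", "15000")),
        use_feature_adapter=_as_bool(env.get("USE_FEATURE_ADAPTER", "true")),
        default_threshold=float(env.get("DEFAULT_THRESHOLD", "0.5")),
        queue_concurrency=int(env.get("QUEUE_CONCURRENCY", "2")),
        blockchair_api_key=env.get("BLOCKCHAIR_API_KEY") or None,  # empty -> None
    )
```

Parsing everything once at startup means a malformed value (for example `MAX_NODES=abc`) fails immediately instead of in the middle of a request.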

Each model repo should contain:

- `model.pt` — PyTorch Geometric weights.
- (optional) `thresholds.json` with content such as `{"threshold": 0.42}`.
- (optional) `scaler.joblib` if you want to reuse the training scaler.
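
To make the threshold fallback concrete, here is a small sketch of reading `thresholds.json` and turning a probability into a label. The helper names, the label strings, and the `>=` comparison are assumptions for illustration; the app may differ on each.

```python
import json
from pathlib import Path
from typing import Union


def load_threshold(path: Union[str, Path], default: float = 0.5) -> float:
    """Read {"threshold": <float>} from a JSON file, else return `default`."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        value = float(data["threshold"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, missing key, or a non-numeric value.
        return default
    # Ignore values that cannot be a probability cutoff.
    return value if 0.0 <= value <= 1.0 else default


def label_for(probability: float, threshold: float) -> str:
    """Hypothetical label names; the comparison direction is an assumption."""
    return "abuse" if probability >= threshold else "normal"
```

In the Space, the path would presumably come from a Hub download such as `hf_hub_download(repo_id, "thresholds.json")`, with the default taken from `DEFAULT_THRESHOLD`.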

## 📦 External APIs Used by the App

- `GET /api/tx/{txid}` and `GET /api/tx/{txid}/outspends` (Esplora).
- `GET /bitcoin/dashboards/transaction/{txid}` (Blockchair).

All calls use **timeouts and retries** (`HTTP_TIMEOUT`, `HTTP_RETRIES`), and responses are kept in a small **in-memory cache**.
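
The timeout, retry, and cache behavior could look roughly like the sketch below. It uses only the standard library and a hypothetical `CachedClient` class; the exponential backoff and the oldest-first eviction are illustrative choices, not details confirmed by the app.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def http_get_json(url: str, timeout: float) -> Any:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


class CachedClient:
    """GET with timeout, retries, and a bounded in-memory cache (sketch)."""

    def __init__(
        self,
        fetch: Callable[[str, float], Any] = http_get_json,
        timeout: float = 10.0,
        retries: int = 2,
        backoff: float = 0.5,
        max_entries: int = 1024,
    ) -> None:
        self.fetch = fetch
        self.timeout = timeout
        self.retries = retries
        self.backoff = backoff
        self.max_entries = max_entries
        self._cache: Dict[str, Any] = {}

    def get(self, url: str) -> Any:
        if url in self._cache:
            return self._cache[url]
        last_error = None
        for attempt in range(self.retries + 1):
            try:
                data = self.fetch(url, self.timeout)
                break
            except (OSError, ValueError) as exc:
                # URLError and socket timeouts are OSError; bad JSON is ValueError.
                last_error = exc
                if attempt < self.retries:
                    time.sleep(self.backoff * (2 ** attempt))
        else:
            raise RuntimeError(
                f"GET {url} failed after {self.retries + 1} attempts"
            ) from last_error
        if len(self._cache) >= self.max_entries:
            # Dicts keep insertion order, so this evicts the oldest entry.
            self._cache.pop(next(iter(self._cache)))
        self._cache[url] = data
        return data
```

A call might look like `CachedClient().get(f"https://mempool.space/api/tx/{txid}")`. Note that `outspends` responses can change as outputs get spent, so a time-to-live on those entries may be worth adding.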

## 🚦 Rate Limiting

A global limit of `20 req/min` applies across the whole app (sliding window). Requests over the limit are rejected with the message `Rate limit exceeded (20 req/min)`.
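
A sliding-window limiter of this kind can be written in a few lines. The sketch below is one plausible shape, with a hypothetical `SlidingWindowLimiter` class and an injectable clock so it can be tested without waiting; it is not taken from the app's source.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any trailing `window` seconds."""

    def __init__(
        self,
        limit: int = 20,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()
        self._lock = threading.Lock()  # one shared limiter for all requests

    def allow(self) -> bool:
        now = self.clock()
        with self._lock:
            # Drop timestamps that have slid out of the window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            if len(self._stamps) >= self.limit:
                return False
            self._stamps.append(now)
            return True
```

A request handler would call `allow()` first and return the rate-limit message when it is `False`. Rejected calls are not recorded, so a burst of rejected requests does not extend the lockout.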

## 🧪 Acceptance Criteria

- Enter a valid tx hash & `k=2` → ego-graph is built, both models run, and the app displays:
  - `probability`, `threshold`, `label` for **GAT** and **GATv2**,
  - counts of nodes/edges and notes (e.g., _FeatureAdapter used_).
- Ego-graph renders with center highlighted; tooltips show txid and score.
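- If the first provider fails, the app falls back to the next one (for example, from Esplora to Blockchair).
- If the graph exceeds the `MAX_NODES` or `MAX_EDGES` safeguard, the app stops expanding, logs a warning, and still runs inference on the partial graph.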

## ⚠️ Notes

- **Domain shift**: Features from on-chain crawls can differ from Elliptic; use the adapter and consider fine-tuning for production.
- Public APIs have their own rate limits — this app is conservative with requests, but heavy usage may still hit external limits.
- Input must be a 64-character hexadecimal txid; it is validated before any request is made. No arbitrary URLs are accepted.
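
The txid check in the last note amounts to a strict pattern match. A minimal sketch, assuming surrounding whitespace is trimmed and the result is lowercased (`normalize_txid` is a hypothetical helper name):

```python
import re

# Exactly 64 hexadecimal characters, nothing before or after.
_TXID_RE = re.compile(r"[0-9a-fA-F]{64}")


def normalize_txid(raw: str) -> str:
    """Return the txid in lowercase, or raise ValueError if it is malformed."""
    candidate = raw.strip()
    if not _TXID_RE.fullmatch(candidate):
        raise ValueError("txid must be exactly 64 hexadecimal characters")
    return candidate.lower()
```

Using `fullmatch` rather than `match` keeps trailing characters such as path segments or query strings from slipping through, which is what prevents the input from being used to build arbitrary URLs.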