---
title: Bitcoin Abuse Scoring (GAT / GATv2)
emoji: 🧭
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---

# Bitcoin Abuse Scoring (GAT / GATv2) - Hugging Face Space

This Space builds an **ego-subgraph** from a given Bitcoin transaction hash (`k` steps backward & forward), then runs **two pretrained GNN models** (GAT baseline & GATv2 enhanced) trained on **Elliptic** to score whether the center transaction is _abuse_.

## ✅ Features

- Data sources (public JSON APIs, no scraping): `mempool.space` / `blockstream.info` (Esplora), fallback to `Blockchair` (optional key).
- Ego-subgraph expansion **k ∈ {1,2,3}** (both parents & children).
- Graph safeguards: `MAX_NODES` & `MAX_EDGES` to avoid explosion.
- Node features: degree stats, value sums/logs, counts, ratio, distance-to-center, block height.
- Standardized features (on-the-fly). If your model used different features/scaler, set `USE_FEATURE_ADAPTER=true` (default); it inserts a `Linear` projection to the expected input dimension (165 by default).
- Two models are loaded from **Hugging Face Hub** with thresholds (via `thresholds.json` or fallback `0.5`).
- **Rate limit**: 20 requests/min globally (sliding window).
- Visualizations: **ego-graph (pyvis HTML)** & **histogram of scores** per model.
- CPU-only deployment on Spaces.

## 🔧 Configuration

Set these **Environment Variables** (Space → Settings → Variables):

```
HF_GAT_BASELINE_REPO=org/name_gat_baseline
HF_GATV2_REPO=org/name_gatv2
# (Optional overrides)
IN_CHANNELS=165
HIDDEN_CHANNELS=128
HEADS=8
NUM_BLOCKS=2
DROPOUT=0.5
DATA_PROVIDER=mempool  # mempool | blockstream | blockchair
HTTP_TIMEOUT=10
HTTP_RETRIES=2
MAX_NODES=5000
MAX_EDGES=15000
USE_FEATURE_ADAPTER=true
DEFAULT_THRESHOLD=0.5
QUEUE_CONCURRENCY=2
BLOCKCHAIR_API_KEY=
```

Each model repo should contain:

- `model.pt`: PyTorch Geometric weights.
- (optional) `thresholds.json` with a key like `{"threshold": 0.42}`.
- (optional) `scaler.joblib` if you want to reuse the training scaler.

## 📦 API Usage in App

- `GET /api/tx/{txid}` and `GET /api/tx/{txid}/outspends` (Esplora).
- `GET /bitcoin/dashboards/transaction/{txid}` (Blockchair).

All calls have **timeouts & retries** and use a small **in-memory cache**.

## 🚦 Rate Limiting

A global limit of `20 req/min` applies across the app (sliding window). Exceeding it returns `Rate limit exceeded (20 req/min)`.

## 🧪 Acceptance Criteria

- Enter a valid tx hash & `k=2` → ego-graph is built, both models run, and the app displays:
  - `probability`, `threshold`, `label` for **GAT** and **GATv2**,
  - counts of nodes/edges and notes (e.g., _FeatureAdapter used_).
- Ego-graph renders with center highlighted; tooltips show txid and score.
- If the first provider fails, the app falls back to the next provider.
- If the graph exceeds the safeguards, the app stops expansion and warns in logs (but still infers with what it has).

## ⚠️ Notes

- **Domain shift**: Features from on-chain crawls can differ from Elliptic; use the adapter and consider fine-tuning for production.
- Public APIs have their own rate limits; this app is conservative with requests, but heavy usage may still hit external limits.
- Input is validated as a 64-character hexadecimal txid. No arbitrary URLs are accepted.
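
The global sliding-window limit described under Rate Limiting could be implemented roughly as below. This is a minimal sketch, not the app's actual code: the class name `SlidingWindowLimiter`, its `allow()` method, and the injectable `clock` parameter are all hypothetical names introduced for illustration. Only the `20 req/min` figure comes from this README.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `limit` calls in any trailing `window` seconds."""

    def __init__(self, limit=20, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock            # injectable so the limiter can be tested without sleeping
        self._stamps = deque()        # timestamps of accepted requests, oldest first
        self._lock = threading.Lock() # one shared limiter for the whole app

    def allow(self):
        """Return True and record the request if it fits in the window, else False."""
        now = self.clock()
        with self._lock:
            # Forget requests that have slid out of the trailing window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            if len(self._stamps) >= self.limit:
                return False
            self._stamps.append(now)
            return True
```

A request handler would call `allow()` once per incoming request and return the `Rate limit exceeded (20 req/min)` message when it yields `False`. Rejected requests are not recorded, so a burst of refused calls does not extend the wait for the next accepted one.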
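
The txid validation and the capped ego-subgraph expansion mentioned above can be illustrated with a small breadth-first sketch. It assumes two hypothetical callbacks, `get_parents` and `get_children`, that return neighbouring txids (in the app these would be backed by the Esplora or Blockchair calls). The function names, the edge direction (parent to child), and the `truncated` flag are illustrative assumptions, not the app's real interface.

```python
import re
from collections import deque

TXID_RE = re.compile(r"[0-9a-fA-F]{64}")


def is_valid_txid(txid):
    """True only for a string of exactly 64 hexadecimal characters."""
    return isinstance(txid, str) and TXID_RE.fullmatch(txid) is not None


def build_ego_graph(center, k, get_parents, get_children,
                    max_nodes=5000, max_edges=15000):
    """Expand up to k hops backward and forward from `center`.

    Returns (dist, edges, truncated): hop distance per txid, a set of
    (parent, child) edges, and whether a safeguard stopped the expansion.
    """
    if not is_valid_txid(center):
        raise ValueError("expected a 64-hex txid")
    dist = {center: 0}
    edges = set()
    truncated = False
    queue = deque([center])
    while queue and not truncated:
        tx = queue.popleft()
        if dist[tx] >= k:
            continue  # frontier node: keep it, but do not expand further
        # Each candidate is (source, destination, newly reached neighbour).
        candidates = [(p, tx, p) for p in get_parents(tx)]
        candidates += [(tx, c, c) for c in get_children(tx)]
        for src, dst, other in candidates:
            if (src, dst) in edges:
                continue
            over_edges = len(edges) >= max_edges
            over_nodes = other not in dist and len(dist) >= max_nodes
            if over_edges or over_nodes:
                truncated = True  # stop expanding; caller infers on the partial graph
                break
            edges.add((src, dst))
            if other not in dist:
                dist[other] = dist[tx] + 1
                queue.append(other)
    return dist, edges, truncated
```

The returned `dist` mapping doubles as the distance-to-center node feature, and a `True` value for `truncated` corresponds to the case where the app warns in logs and still runs inference on the partial graph.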