Space: thanhphxu/MLGraph-Bitcoin-GAD

title: Bitcoin Abuse Scoring (GAT / GATv2)
emoji: 🧭
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false

Bitcoin Abuse Scoring (GAT / GATv2) — Hugging Face Space

This Space builds an ego-subgraph around a given Bitcoin transaction hash by expanding k steps backward (to parent transactions) and forward (to child transactions). It then runs two pretrained GNN models, a GAT baseline and an enhanced GATv2, both trained on the Elliptic dataset, to score whether the center transaction is abusive.

✅ Features

- Data sources (public JSON APIs, no scraping): mempool.space and blockstream.info (Esplora), with a fallback to Blockchair (API key optional).
- Ego-subgraph expansion with k ∈ {1, 2, 3}, following both parents and children.
- Graph safeguards: MAX_NODES and MAX_EDGES cap the subgraph size to avoid explosion.
- Node features: degree stats, value sums/logs, counts, ratio, distance-to-center, block height.
- Features are standardized on the fly. If your model was trained with different features or a different scaler, set USE_FEATURE_ADAPTER=true (the default): it inserts a Linear projection to the expected input dimension (165 by default).
- Both models are loaded from the Hugging Face Hub, each with a decision threshold (read from thresholds.json, falling back to 0.5).
- Rate limit: 20 requests/min globally (sliding window).
- Visualizations: ego-graph (pyvis HTML) and a histogram of scores per model.
- CPU-only deployment on Spaces.
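
The expansion and safeguards listed above can be sketched as a bounded breadth-first search. This is a minimal illustration, not the app's actual code: `build_ego_subgraph`, `get_parents`, and `get_children` are hypothetical names, and the two lookups are assumed to wrap whichever data provider is configured. The default caps simply mirror the MAX_NODES and MAX_EDGES defaults from the configuration section.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (funding txid, spending txid)


def build_ego_subgraph(
    center: str,
    k: int,
    get_parents: Callable[[str], Iterable[str]],   # hypothetical provider lookup
    get_children: Callable[[str], Iterable[str]],  # hypothetical provider lookup
    max_nodes: int = 5000,
    max_edges: int = 15000,
) -> Tuple[Dict[str, int], Set[Edge], bool]:
    """Return (hop distance per node, directed edges, truncated flag)."""
    dist: Dict[str, int] = {center: 0}
    edges: Set[Edge] = set()
    queue = deque([center])

    while queue:
        node = queue.popleft()
        if dist[node] >= k:
            continue  # nodes at the k-hop frontier are kept but not expanded
        # Edges always point from the funding transaction to the spender.
        candidates = [(parent, node, parent) for parent in get_parents(node)]
        candidates += [(node, child, child) for child in get_children(node)]
        for src, dst, neighbor in candidates:
            if (src, dst) in edges:
                continue
            is_new = neighbor not in dist
            if len(edges) >= max_edges or (is_new and len(dist) >= max_nodes):
                return dist, edges, True  # cap reached: stop, keep the partial graph
            edges.add((src, dst))
            if is_new:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist, edges, False
```

The returned hop distances double as the distance-to-center node feature, and the truncated flag is what a caller would use to log a warning before running inference on the partial graph.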

🔧 Configuration

Set these environment variables (Space → Settings → Variables):

HF_GAT_BASELINE_REPO=org/name_gat_baseline
HF_GATV2_REPO=org/name_gatv2

# (Optional overrides)
IN_CHANNELS=165
HIDDEN_CHANNELS=128
HEADS=8
NUM_BLOCKS=2
DROPOUT=0.5

DATA_PROVIDER=mempool    # mempool | blockstream | blockchair
HTTP_TIMEOUT=10
HTTP_RETRIES=2
MAX_NODES=5000
MAX_EDGES=15000
USE_FEATURE_ADAPTER=true
DEFAULT_THRESHOLD=0.5
QUEUE_CONCURRENCY=2
BLOCKCHAIR_API_KEY=
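
As an illustration of how these variables might be consumed, here is a minimal sketch that reads a subset of them with the defaults listed above. The `Settings` dataclass and `load_settings` function are hypothetical names, not taken from app.py; treating an empty value the same as an unset one is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

PROVIDERS = ("mempool", "blockstream", "blockchair")


@dataclass(frozen=True)
class Settings:
    in_channels: int
    data_provider: str
    http_timeout: float
    http_retries: int
    max_nodes: int
    max_edges: int
    use_feature_adapter: bool
    default_threshold: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    def get(name: str, default: str) -> str:
        # Assumption: unset and empty variables both fall back to the default.
        return env.get(name, "").strip() or default

    provider = get("DATA_PROVIDER", "mempool").lower()
    if provider not in PROVIDERS:
        raise ValueError(f"DATA_PROVIDER must be one of {PROVIDERS}, got {provider!r}")

    return Settings(
        in_channels=int(get("IN_CHANNELS", "165")),
        data_provider=provider,
        http_timeout=float(get("HTTP_TIMEOUT", "10")),
        http_retries=int(get("HTTP_RETRIES", "2")),
        max_nodes=int(get("MAX_NODES", "5000")),
        max_edges=int(get("MAX_EDGES", "15000")),
        use_feature_adapter=get("USE_FEATURE_ADAPTER", "true").lower() in {"1", "true", "yes"},
        default_threshold=float(get("DEFAULT_THRESHOLD", "0.5")),
    )
```

Passing the environment as a mapping keeps the parser easy to exercise with a plain dict, for example `load_settings({"MAX_NODES": "100"})`.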

Each model repo should contain:

- model.pt: PyTorch Geometric weights.
- (optional) thresholds.json with a key like {"threshold": 0.42}.
- (optional) scaler.joblib if you want to reuse the training scaler.
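
The optional threshold file and its fallback can be sketched as follows. This is a hedged example that assumes thresholds.json has already been downloaded to a local path; `load_threshold` and `label_for` are hypothetical helpers, and both the label strings and the choice to flag scores at or above the threshold are assumptions rather than documented behavior.

```python
import json
from pathlib import Path
from typing import Optional


def load_threshold(path: Optional[Path], default: float = 0.5) -> float:
    """Read {"threshold": x} from a JSON file; fall back to `default` on any problem."""
    if path is None or not path.is_file():
        return default
    try:
        value = float(json.loads(path.read_text(encoding="utf-8"))["threshold"])
    except (OSError, ValueError, KeyError, TypeError):
        return default
    # A probability threshold outside [0, 1] is treated as invalid.
    return value if 0.0 <= value <= 1.0 else default


def label_for(probability: float, threshold: float) -> str:
    # Assumption: scores at or above the threshold are flagged.
    return "abuse" if probability >= threshold else "normal"
```

With the example file above, a score of 0.45 would be flagged under the 0.42 threshold but not under the 0.5 fallback.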

📦 API Usage in App

- GET /api/tx/{txid} and GET /api/tx/{txid}/outspends (Esplora).
- GET /bitcoin/dashboards/transaction/{txid} (Blockchair).

All calls have timeouts & retries and use a small in-memory cache.
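
A minimal sketch of the timeout, retry, cache, and fallback behavior described above, using only the standard library. `TxClient` and its parameters are hypothetical; the two base URLs are the public Esplora endpoints of mempool.space and blockstream.info, and the cache size and backoff values are assumptions. The `fetch` argument is injectable so the logic can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Optional, Sequence

ESPLORA_BASES = ("https://mempool.space/api", "https://blockstream.info/api")


def http_get_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and decode the JSON body; raises OSError/ValueError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


class TxClient:
    """Hypothetical client: retries per provider, then falls back to the next one."""

    def __init__(
        self,
        bases: Sequence[str] = ESPLORA_BASES,
        fetch: Callable[[str], dict] = http_get_json,
        retries: int = 2,
        backoff: float = 0.5,
        max_cache: int = 1024,
    ) -> None:
        self.bases, self.fetch = bases, fetch
        self.retries, self.backoff, self.max_cache = retries, backoff, max_cache
        self._cache: Dict[str, dict] = {}

    def get_tx(self, txid: str) -> dict:
        if txid in self._cache:
            return self._cache[txid]
        last_error: Optional[Exception] = None
        for base in self.bases:  # provider fallback
            for attempt in range(self.retries + 1):
                try:
                    data = self.fetch(f"{base}/tx/{txid}")
                except (OSError, ValueError) as exc:
                    last_error = exc
                    time.sleep(self.backoff * attempt)  # wait grows with each failed attempt
                    continue
                if len(self._cache) >= self.max_cache:
                    self._cache.pop(next(iter(self._cache)))  # evict the oldest entry
                self._cache[txid] = data
                return data
        raise RuntimeError(f"all providers failed for {txid}") from last_error
```

Confirmed transactions are immutable, which is why caching the /tx response by txid is safe; the /outspends response can change as outputs get spent, so a real cache for that endpoint would likely need an expiry.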

🚦 Rate Limiting

A global limit of 20 requests per minute applies across the whole app (sliding window). Requests beyond the limit are rejected with the message "Rate limit exceeded (20 req/min)".
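
A sliding-window limiter of this kind can be sketched with a deque of timestamps. The class below is a hypothetical illustration, not the app's implementation; the injectable `clock` parameter is an assumption added to make the behavior easy to check.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any rolling `window_seconds` span."""

    def __init__(
        self,
        max_requests: int = 20,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._stamps: Deque[float] = deque()
        self._lock = threading.Lock()  # the limit is global, so guard shared state

    def allow(self) -> bool:
        now = self.clock()
        with self._lock:
            # Drop timestamps that have slid out of the window.
            while self._stamps and now - self._stamps[0] >= self.window_seconds:
                self._stamps.popleft()
            if len(self._stamps) >= self.max_requests:
                return False
            self._stamps.append(now)
            return True
```

A request handler would call `allow()` first and return the rate-limit message when it yields False. Rejected requests are not recorded, so they do not extend the wait for later callers.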

🧪 Acceptance Criteria

- Entering a valid tx hash with k=2 builds the ego-graph, runs both models, and displays:
  - probability, threshold, and label for GAT and GATv2,
  - counts of nodes/edges and notes (e.g., FeatureAdapter used).
- The ego-graph renders with the center highlighted; tooltips show txid and score.
- If the first provider fails, the app falls back to the next one.
- If the graph exceeds the safeguards, the app stops expansion and logs a warning, but still runs inference on the partial graph.

⚠️ Notes

- Domain shift: features from on-chain crawls can differ from those in Elliptic; use the adapter and consider fine-tuning for production.
- Public APIs have their own rate limits. The app is conservative with requests, but heavy usage may still hit external limits.
- Input is validated as a 64-character hexadecimal txid. No arbitrary URLs are accepted.
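
The txid check can be sketched with a single anchored regular expression. `normalize_txid` is a hypothetical helper; trimming whitespace and lowercasing the result are assumptions, not documented behavior of the app.

```python
import re

_TXID_RE = re.compile(r"[0-9a-fA-F]{64}")


def normalize_txid(raw: str) -> str:
    """Return the txid in lowercase, or raise ValueError if it is not 64 hex characters."""
    candidate = raw.strip()
    # fullmatch rejects anything with extra characters, including URLs and paths.
    if not _TXID_RE.fullmatch(candidate):
        raise ValueError("expected a 64-character hexadecimal transaction id")
    return candidate.lower()
```

Because only the validated txid is ever interpolated into a provider URL, user input cannot redirect requests to an arbitrary host.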