---
license: apache-2.0
library_name: scikit-learn
tags:
- cybersecurity
- blockchain
- network-security
- validator-security
- anomaly-detection
- intrusion-detection
- ddos
- cve
datasets:
- NullRabbit/nr-bundles-public
metrics:
- roc_auc
- f1
---

# nr-network-known-class-detector

A binary **attack-vs-benign** detector for **blockchain-node network/resource attacks**, trained entirely on faithful reproductions of **publicly-disclosed** attacks. Every attack class in the corpus reproduces a specific public disclosure (a CVE, a GHSA, or a named third-party security audit), and each carries a `provenance.source_class` recording how public its sourcing is.

Part of NullRabbit's work on **autonomous defence for decentralised networks**: *watch the outside of the perimeter*.

> **STATUS: DIAGNOSTIC, not a deployment claim.** Trained on synthetic localnet reproductions (lab
> fidelity), not real production traffic. See *Evaluation* and *Limitations*.

## Model description

Given the network-layer signal of a short capture window against a blockchain node (packet-rate / size statistics from the pcap, and amplification / request-response / timing statistics from the RPC responses), the model emits a calibrated attack probability. It is one multi-family model over the `network-v1` feature manifold; it spans 9 chains, all public-sourced (this published cut is trained on `public-cve-replication` primitives only).

## Architecture

- `HistGradientBoostingClassifier` + isotonic calibration (scikit-learn), NaN-native.
- pcap aggregates + RPC-response aggregates, degenerate columns dropped by a per-fit robust-column guard. No host-load features (the containerised lab node is root-owned, so host metrics are unreadable).
- Decision threshold 0.5 (calibrated). Inference is **scoreability-gated**: a record with no network signal (e.g. an economic/DeFi bundle) returns `scoreable=False` with no verdict.

## Training data: 47 public-CVE attack primitives, 9 chains, 1434 bundles

**This is the public-CVE cut** (`public-cve-replication` only): 966 attack + 468 benign bundles (`pcap + responses + manifest`), 53 chain×primitive instances. Benign traffic exercises the **same methods / wire messages** the attacks abuse, at normal scale, so the model separates attack-*use* from benign-*use*, not message type. Every attack reproduces an external public disclosure with a `provenance.public_source` URL. (This cut contains 0 `original` primitives. Those primitives, NullRabbit's own measurements of vendor-acknowledged RPC amplification for which no CVE exists, are held in the full corpus but **excluded from this published cut**, so the "trained entirely on public disclosures" claim above is literal.)

| primitive | chain · layer | public source | source_class |
|---|---|---|---|
| `btc_addr_overflow_flood` | Bitcoin / Dogecoin / Litecoin · p2p | [CVE-2024-52919](https://bitcoincore.org/en/2025/04/28/disclose-cve-2024-52919/) | public-cve-replication |
| `btc_alert_flood` | Bitcoin · p2p | [CVE-2016-10724](https://nvd.nist.gov/vuln/detail/CVE-2016-10724) | public-cve-replication |
| `btc_blocktxn_double_fillblock` | Bitcoin · p2p | [CVE-2024-35202](https://bitcoincore.org/en/2024/10/08/disclose-blocktxn-crash/) | public-cve-replication |
| `btc_bloom_divzero` | Bitcoin · p2p | [CVE-2013-5700](https://nvd.nist.gov/vuln/detail/CVE-2013-5700) | public-cve-replication |
| `btc_cmpctblock_overflow` | Bitcoin · p2p | [CVE-2025-46597](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-46597/) | public-cve-replication |
| `btc_cmpctblock_stall` | Bitcoin · p2p | [CVE-2024-52922](https://bitcoincore.org/en/2024/11/05/cb-stall-hindering-propagation/) | public-cve-replication |
| `btc_getdata_flood` | Bitcoin · p2p | [CVE-2024-52920](https://bitcoincore.org/en/2024/07/03/disclose-getdata-cpu/) | public-cve-replication |
| `btc_headers_genesis_spam` | Bitcoin · p2p | [CVE-2024-52916](https://bitcoincore.org/en/2024/07/03/disclose-header-spam/) | public-cve-replication |
| `btc_headers_oom` | Bitcoin · p2p | [CVE-2019-25220](https://bitcoincore.org/en/2024/09/18/disclose-headers-oom/) | public-cve-replication |
| `btc_inv_buffer_blowup` | Bitcoin · p2p | [CVE-2024-52915](https://bitcoincore.org/en/2024/07/03/disclose-inv-buffer-blowup/) | public-cve-replication |
| `btc_inv_eviction_jam` | Bitcoin · p2p | [CVE-2024-52913](https://bitcoincore.org/en/2024/07/03/disclose_already_asked_for/) | public-cve-replication |
| `btc_invalid_block_logfill` | Bitcoin · p2p | [CVE-2025-54605](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-54605/) | public-cve-replication |
| `btc_invdos_flood` | Bitcoin · p2p | [CVE-2018-17145](https://invdos.net/) | public-cve-replication |
| `btc_mutated_block` | Bitcoin · p2p | [CVE-2024-52921](https://bitcoincore.org/en/2024/10/08/disclose-mutated-blocks-hindering-propagation/) | public-cve-replication |
| `btc_orphan_cpu` | Bitcoin / Dogecoin / Litecoin · p2p | [CVE-2024-52914](https://bitcoincore.org/en/2024/07/03/disclose-orphan-dos/) | public-cve-replication |
| `btc_oversized_recv_buffer` | Bitcoin · p2p | [CVE-2015-3641](https://bitcoincore.org/en/2024/07/03/disclose_receive_buffer_oom/) | public-cve-replication |
| `btc_tx_maprelay` | Bitcoin · p2p | [CVE-2013-4627](https://nvd.nist.gov/vuln/detail/CVE-2013-4627) | public-cve-replication |
| `btc_tx_quad_sighash` | Bitcoin · p2p | [CVE-2025-46598](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-46598/) | public-cve-replication |
| `btc_version_selfnonce` | Bitcoin · p2p | [CVE-2025-54604](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-54604/) | public-cve-replication |
| `btc_version_timestamp_overflow` | Bitcoin · p2p | [CVE-2024-52912](https://bitcoincore.org/en/2024/07/03/disclose-timestamp-overflow/) | public-cve-replication |
| `p2p_getheaders_drain` | Bitcoin / Dogecoin / Litecoin · p2p | [CVE-2023-33297](https://nvd.nist.gov/vuln/detail/CVE-2023-33297) | public-cve-replication |
| `cometbft_bitarray_mismatch` | Cosmos · cometbft-p2p-secretconn | [GHSA-hrhf-2vcr-ghch](https://github.com/cometbft/cometbft/security/advisories/GHSA-hrhf-2vcr-ghch) | public-cve-replication |
| `cometbft_blockpart_mismatch` | Cosmos · cometbft-p2p-secretconn | [GHSA-r3r4-g7hq-pq4f](https://github.com/advisories/GHSA-r3r4-g7hq-pq4f) | public-cve-replication |
| `cometbft_voteext_panic` | Cosmos · cometbft-p2p-secretconn | [GHSA-p7mv-53f2-4cwj](https://github.com/cometbft/cometbft/security/advisories/GHSA-p7mv-53f2-4cwj) | public-cve-replication |
| `cosmos_gogoproto_skippy` | Cosmos · rpc-broadcast-tx | [CVE-2021-3121](https://osv.dev/vulnerability/CVE-2021-3121) | public-cve-replication |
| `cosmos_group_divzero_halt` | Cosmos · rpc-group-module-tx | [GHSA-x5vx-95h7-rv4p](https://github.com/cosmos/cosmos-sdk/security/advisories/GHSA-x5vx-95h7-rv4p) | public-cve-replication |
| `cosmos_p2p_conn_flood` | Cosmos · tcp-p2p-conn-flood | [CVE-2020-5303](https://github.com/tendermint/tendermint/security/advisories/GHSA-v24h-pjjv-mcp6) | public-cve-replication |
| `cosmos_protobuf_nest_bomb` | Cosmos · rpc-broadcast-tx | [GHSA-8wcc-m6j2-qxvm](https://github.com/cosmos/cosmos-sdk/security/advisories/GHSA-8wcc-m6j2-qxvm) | public-cve-replication |
| `geth_devp2p_ping_flood` | Ethereum · devp2p-rlpx | [CVE-2023-40591](https://github.com/ethereum/go-ethereum/security/advisories/GHSA-ppjg-v974-84cm) | public-cve-replication |
| `geth_eth_receipt_flood` | Ethereum · devp2p-rlpx | [EL-2024-20](https://reports.immunefi.com/ethereum-protocol-or-attackathon/37466-bc-medium-evil-client-oom-crash-fast-p2p-crash) | public-cve-replication |
| `geth_getblockheaders_count_zero` | Ethereum · devp2p-rlpx | [CVE-2024-32972](https://github.com/ethereum/go-ethereum/security/advisories/GHSA-4xc9-8hmq-j652) | public-cve-replication |
| `geth_rlpx_auth_flood` | Ethereum · devp2p-rlpx | [EL-2026-06](https://notes.ethereum.org/gDWKW5RtSym02t2aGYkmSQ) | public-cve-replication |
| `geth_snap_trienode_dos` | Ethereum · devp2p-rlpx | [CVE-2021-41173](https://github.com/ethereum/go-ethereum/security/advisories/GHSA-59hh-656j-3p7v) | public-cve-replication |
| `geth_tcp_handshake_flood` | Ethereum · devp2p-rlpx | [EL-2024-06](https://reports.immunefi.com/ethereum-protocol-or-attackathon/37120-bc-insight-remote-handshake-based-tcp-30303-flooding-leads-to-an-out-of-memory-crash) | public-cve-replication |
| `monero_levin_array_memcorrupt` | Monero · levin-p2p | [CVE-2018-3972](https://www.talosintelligence.com/vulnerability_reports/TALOS-2018-0637) | public-cve-replication |
| `monero_portable_storage_oom` | Monero · levin-p2p | [PR#7190](https://github.com/monero-project/monero/pull/7190) | public-cve-replication |
| `monero_rpc_conn_exhaustion` | Monero · http-rpc | [CVE-2025-26819](https://nvd.nist.gov/vuln/detail/CVE-2025-26819) | public-cve-replication |
| `sol_tpu_quic_handshake_flood` | Solana · tpu-quic | [ND-FD04-LO-01](https://neodyme.io/reports/Firedancer-v0.4.pdf) | public-cve-replication |
| `sol_tpu_quic_initial_cpu` | Solana · tpu-quic | [ND-FD1-MD-02](https://neodyme.io/reports/Firedancer.pdf) | public-cve-replication |
| `sol_tpu_quic_slowloris` | Solana · tpu-quic | [ND-FD04-IN-02](https://neodyme.io/reports/Firedancer-v0.4.pdf) | public-cve-replication |
| `sui_disassemble_panic` | Sui · json-rpc | [CertiK Skyfall](https://medium.com/certik-skyfall/blockchain-rpc-vulnerabilities-why-memory-safe-blockchain-rpc-nodes-are-not-panic-free-9fbb990115e0) | public-cve-replication |
| `sui_move_recursion` | Sui · json-rpc | [CVE-2023-36184](https://nvd.nist.gov/vuln/detail/CVE-2023-36184) | public-cve-replication |
| `sui_verifier_hamsterwheel` | Sui · json-rpc | [CertiK Skyfall HamsterWheel](https://medium.com/certik-skyfall/the-hamsterwheel-an-in-depth-exploration-of-a-novel-attack-vector-on-the-sui-blockchain-522f80623bc7) | public-cve-replication |
| `gossipsub_prune_backoff_overflow` | libp2p · libp2p-gossipsub | [CVE-2026-34219](https://github.com/libp2p/rust-libp2p/security/advisories/GHSA-xqmp-fxgv-xvq5) | public-cve-replication |
| `gossipsub_subscribe_flood` | libp2p · libp2p-gossipsub | [CVE-2026-46679](https://github.com/advisories/GHSA-4f8r-922h-2vgv) | public-cve-replication |
| `libp2p_signed_peer_record_flood` | libp2p · libp2p-gossipsub | [CVE-2023-40583](https://github.com/advisories/GHSA-gcq9-qqwx-rgj3) | public-cve-replication |
| `libp2p_stream_exhaustion` | libp2p · libp2p-gossipsub | [CVE-2022-23492](https://github.com/advisories/GHSA-j7qp-mfxf-8xjw) | public-cve-replication |

Distribution: **966** `public-cve-replication` attack bundles (**47 distinct primitives across 9 chains**: Bitcoin, Cosmos, Dogecoin, Ethereum, libp2p, Litecoin, Monero, Solana, Sui) plus **468** benign. This published cut contains **no `original` bundles**; the `original` RPC-measurement primitives live in the full internal corpus and ship only if the operator explicitly opts in, always under their honest label.

## Training procedure (methodology is the contribution)

Per NullRabbit's pre-registration discipline: the corpus is built attack-by-attack from a public disclosure with `provenance.public_source`; a Cleanlab data-quality scan gates label issues and duplicates before training; a methodology auditor reviews each gate event with sanity floors and falsification holdouts; honest limitations are stated; the cycles, not the final number, are the contribution. This card and the model are regenerated automatically from the cut on each training run, so the numbers below always match the shipped model.

## Evaluation

Diagnostic ML checks (the corpus of faithfully-modelled public attacks is the deliverable; these are secondary). Reproduced by `scripts/known_class_loco_eval.py` + `scripts/corpus_quality.py`.

- **Within-corpus held-out (binary attack-vs-benign ROC-AUC, GroupKFold by primitive): 0.9533.** `corpus_sha256 known-class-v10-publiccve`.
- **Leave-one-chain-out, binary ROC-AUC (HARD zero-shot transfer, *not* a deployment metric):** Dogecoin 1.000 / Litecoin 1.000 / Sui 1.000 / Ethereum 0.996 / Solana 0.969 / Cosmos 0.964 / Bitcoin 0.888 / libp2p 0.832 / Monero 0.678. Chains with few public-CVE primitives have the fewest cross-chain near-neighbours; the companion [`nr-bundles-public`](https://huggingface.co/datasets/NullRabbit/nr-bundles-public) dataset card reports the *stricter* held-out-chain 7-class family macro-F1 (0.17 Sui / 0.35 Solana vs ~0.14 floor). Reported honestly, not a deployment claim.
- **Leave-one-attack-primitive-out within Bitcoin (leak-clean disjoint-benign):** all Bitcoin primitives ≥ 0.988. Detection is on traffic *shape*, not deep wire-semantics.

## Intended uses

Research and benchmarking of network/resource-abuse detection on blockchain infrastructure; a worked, public-provenance reference corpus; downstream training. **Not** a turnkey production IDS.

## Limitations

- **Synthetic lab fidelity**: generated localnet traffic, not a real-world deployment claim. A deployment claim needs a real-traffic validation gate (real mainnet RPC + real attack instances).
- **Detection is on traffic *shape*** (volume / rate / size / connection-churn), not deep wire semantics. This is adequate for these volumetric/crash DoS classes, but it would not separate two attacks with identical traffic profiles.
- **No host-load features** (root-owned container).
- **This is the public-CVE cut**: every shipped attack class reproduces an external public disclosure. The `original` RPC-amplification measurements (vendor-acknowledged but not CVE-backed) are **excluded** from this model; they exist in the full internal corpus and ship only on explicit operator opt-in.

## How to use

```python
from predict import load, predict

model = load("model.joblib")
out = predict(model, [{"pcap.packet_rate": 850.0, "resp.amp_ratio_max": 224.0}])
# -> [{"scoreable": True, "score": ..., "verdict": "attack"|"benign", "threshold": 0.5}]
```

Run `python inference_example.py` for a worked example on real captured vectors.

## Licensing

Apache-2.0 (see `LICENSE`). Attribution appreciated.

## Citation

```bibtex
@software{nullrabbit_network_known_class_2026,
  author = {NullRabbit Labs},
  title  = {nr-network-known-class-detector: a public-provenance blockchain network-attack detector},
  year   = {2026},
  url    = {https://huggingface.co/NullRabbit/nr-network-known-class-detector}
}
```

Related: the open **bundle format** (`nr-bundle-spec`), the **family taxonomy** (mechanism-defined), the **earned-autonomy framework** ([Zenodo 10.5281/zenodo.18406828](https://doi.org/10.5281/zenodo.18406828)), the dataset `NullRabbit/nr-bundles-public`, and [nullrabbit.ai](https://nullrabbit.ai).

## Contact

NullRabbit Labs: [huggingface.co/NullRabbit](https://huggingface.co/NullRabbit) · [nullrabbit.ai](https://nullrabbit.ai)