| --- |
| license: mit |
| language: en |
| library_name: pytorch |
| tags: |
| - cryptography |
| - elliptic-curve |
| - secp256k1 |
| - bitcoin |
| - negative-result |
| - research |
| pipeline_tag: tabular-classification |
| --- |
| |
| # secp256k1 Parity Prediction β Negative-Result Study |
|
|
| **TL;DR β This is a negative-result artifact.** Five model families were trained to predict the parity of the secp256k1 private key `k` from features of the public-key point `(X, Y)`. All converged to **exactly 50% accuracy** on held-out k's, consistent with the long-held cryptographic belief that even one bit of the discrete logarithm is as hard to recover as the whole key. |
|
|
| > Do **not** use these models for any cryptographic application. They cannot predict the parity of `k`, by construction. |
|
|
| ## Task |
|
|
| Given a secp256k1 public-key point `Q = kΒ·G = (X, Y)`, predict `k mod 2` (whether the underlying private scalar is odd or even). |
|
|
| This is the "least-significant bit" of the discrete logarithm. |
|
|
| ## Theoretical context (carefully stated) |
|
|
| There is a body of work showing that individual bits of the discrete logarithm are "hardcore" β that predicting any single bit with non-negligible advantage implies efficiently solving the full discrete log: |
|
|
| - **Blum & Micali (1984)** β introduced the framework of hardcore predicates and proved the MSB ("which half") of DL in Z*_p is hardcore. |
| - **Long & Wigderson (1988)** β extended this to O(log log p) bits. |
| - **Boneh & Venkatesan (1996)** β hardness of the most significant bits of DiffieβHellman shared secrets. |
| - **HΓ₯stad & NΓ€slund (β2004, J. ACM)** β *"The Security of all RSA and Discrete Log Bits"*. Proved every individual bit of the discrete log in Z*_p (and of the RSA function) is hardcore. |
| |
| These foundational results are for the **multiplicative group** Z*_p (and RSA), **not literally for elliptic curves**. The analogous statement is widely *accepted* and *expected* to hold for ECDLP on curves like secp256k1, and follows from the same proof techniques applied to the ECDLP setting, but a single canonical "ECDLP bit hardness" reference of the same standing as HΓ₯stadβNΓ€slund is harder to point to cleanly. In practice this is the universally held view in the cryptographic community. |
|
|
| The empirical result here is consistent with that belief: no architecture we tried beats 50%. |
|
|
| ## Data |
|
|
| - **Range:** `k = 1 β¦ 1,000,000` (sequential) |
| - **Generator:** the standard Bitcoin generator `G` |
| - **Splits:** sequential by `k` β train 70% (k=1..700K), val 15% (k=700K..850K), holdout 15% (k=850K..1M). |
| - **Inputs:** 44 engineered features of `X` and `Y` (decimal digit statistics, mod residues, bit-length, popcount, comparisons). Target column `k_state` and traceability column `k` are excluded from inputs. |
|
|
| See `feature_cols.json` for the full input schema. |
|
|
| ## Models |
|
|
| | Model | Architecture | Params | Holdout acc | Holdout AUC | Train time | |
| |---|---|---|---|---|---| |
| | Logistic Regression | sklearn baseline | tiny | 0.4982 | 0.4983 | ~3 s | |
| | XGBoost | 400 trees, depth 6 | β | 0.5033 | 0.5036 | ~3 s | |
| | LightGBM | 400 trees, 63 leaves | β | 0.5002 | 0.4998 | ~11 s | |
| | MLP | 4 layers, 512β512β256 | ~1M | 0.4997 | 0.5009 | ~8 s | |
| | **BitTransformer** | 4-layer encoder, d=128, over raw 512-bit (X\|Y) | 859K | **0.5000** | **0.4998** | ~2.8 h | |
| | Permutation sanity (XGBoost on shuffled labels) | β | β | 0.4994 | 0.4994 | β | |
|
|
| All accuracies are statistically indistinguishable from 0.5. The BitTransformer specifically tested the hypothesis that bit-level nested XOR/AND patterns might leak the parity β they don't. |
|
|
| ## Repository structure |
|
|
| ``` |
| . |
| βββ README.md # this file |
| βββ metrics.json # full per-model metrics |
| βββ feature_cols.json # input feature schema |
| βββ data/ |
| β βββ features_sample_1k.parquet # 1K-row sample of training data |
| βββ models/ |
| β βββ xgb.json # XGBoost |
| β βββ lgbm.txt # LightGBM |
| β βββ mlp.pt # PyTorch MLP state_dict |
| β βββ bit_xformer.pt # PyTorch BitTransformer state_dict |
| βββ scripts/ |
| βββ gen.py # generates feature parquet from k range |
| βββ train.py # trains logreg / XGBoost / LightGBM / MLP |
| βββ bit_transformer.py # trains the BitTransformer |
| βββ predict.py # single-point inference |
| βββ batch_eval.py # batch evaluation on fresh k range |
| βββ build_html.py # generates a result-table HTML |
| ``` |
|
|
| ## Reproducing |
|
|
| ```bash |
| # 1. install deps |
| pip install numpy pandas pyarrow scikit-learn xgboost lightgbm torch huggingface_hub |
| |
| # 2. generate features (k=1..1M) |
| python scripts/gen.py 1000000 |
| |
| # 3. train the fast models |
| python scripts/train.py |
| |
| # 4. train the bit-transformer (takes ~3 h on an NVIDIA L4) |
| python scripts/bit_transformer.py |
| ``` |
|
|
| Generation is single-threaded Python big-int arithmetic and is the wall-clock bottleneck (~5 min for 1M points). Training the fast models is seconds; the BitTransformer is hours due to sequence length 513 with 4 attention layers. |
|
|
| ## Loading the models |
|
|
| ```python |
| # XGBoost |
| import xgboost as xgb |
| m = xgb.XGBClassifier(); m.load_model("models/xgb.json") |
| |
| # LightGBM |
| import lightgbm as lgb |
| m = lgb.Booster(model_file="models/lgbm.txt") |
| |
| # MLP |
| import torch, torch.nn as nn |
| D = 44 |
| mlp = nn.Sequential(nn.Linear(D,512), nn.ReLU(), |
| nn.Linear(512,512), nn.ReLU(), |
| nn.Linear(512,256), nn.ReLU(), |
| nn.Linear(256,1)) |
| mlp.load_state_dict(torch.load("models/mlp.pt")) |
| |
| # BitTransformer |
| class BitTransformer(nn.Module): |
| def __init__(self, seq_len=512, d=128, nhead=4, nlayers=4): |
| super().__init__() |
| self.tok = nn.Embedding(2, d) |
| self.pos = nn.Parameter(torch.randn(1, seq_len, d) * 0.02) |
| self.cls = nn.Parameter(torch.randn(1, 1, d) * 0.02) |
| enc = nn.TransformerEncoderLayer(d_model=d, nhead=nhead, dim_feedforward=4*d, |
| batch_first=True, activation="gelu", norm_first=True) |
| self.enc = nn.TransformerEncoder(enc, num_layers=nlayers) |
| self.head = nn.Linear(d, 1) |
| def forward(self, x_bits): |
| h = self.tok(x_bits) + self.pos |
| cls = self.cls.expand(h.size(0), -1, -1) |
| h = torch.cat([cls, h], dim=1) |
| h = self.enc(h) |
| return self.head(h[:, 0, :]).squeeze(1) |
| bx = BitTransformer() |
| bx.load_state_dict(torch.load("models/bit_xformer.pt")) |
| ``` |
|
|
| ## Intended use |
|
|
| - **Research:** confirming empirically that off-the-shelf ML cannot recover the LSB of a secp256k1 private key from its public-key coordinates. |
| - **Education:** demonstrating how features that "look informative" (digit sums, mod residues, popcount) carry zero signal about cryptographic targets. |
| - **Negative control / sanity check** for related ML-cryptanalysis projects. |
|
|
| ## Out-of-scope use |
|
|
| - Any actual cryptographic key-recovery attempt. These models cannot do it; the underlying theory says no efficient model can. |
| - Any conclusion about non-curve-based ciphers or curves other than secp256k1. |
|
|
| ## Hardware & cost |
|
|
| - Trained on a single GCP `g2-standard-8` instance with 1Γ NVIDIA L4 GPU. |
| - Total wall-clock: ~3 hours (mostly the BitTransformer). |
| - Total cost: ~$5 USD. |
|
|
| ## License |
|
|
| MIT. |
|
|
| ## Citation |
|
|
| If you reference this in research, please cite by the repo URL. |
|
|