secp256k1 Parity Prediction — Negative-Result Study

TL;DR — This is a negative-result artifact. Five model families were trained to predict the parity of the secp256k1 private key k from features of the public-key point (X, Y). All converged to exactly 50% accuracy on held-out k's, consistent with the long-held cryptographic belief that even one bit of the discrete logarithm is as hard to recover as the whole key.

Do not use these models for any cryptographic application. They cannot predict the parity of k, by construction.

Task

Given a secp256k1 public-key point Q = k·G = (X, Y), predict k mod 2 (whether the underlying private scalar is odd or even).

This is the "least-significant bit" of the discrete logarithm.

Theoretical context (carefully stated)

There is a body of work showing that individual bits of the discrete logarithm are "hardcore" — that predicting any single bit with non-negligible advantage implies efficiently solving the full discrete log:

Blum & Micali (1984) — introduced the framework of hardcore predicates and proved the MSB ("which half") of DL in Z*_p is hardcore.
Long & Wigderson (1988) — extended this to O(log log p) bits.
Boneh & Venkatesan (1996) — hardness of the most significant bits of Diffie–Hellman shared secrets.
Håstad & Näslund (≈2004, J. ACM) — "The Security of all RSA and Discrete Log Bits". Proved every individual bit of the discrete log in Z*_p (and of the RSA function) is hardcore.

These foundational results are for the multiplicative group Z*_p (and RSA), not literally for elliptic curves. The analogous statement is widely accepted and expected to hold for ECDLP on curves like secp256k1, and follows from the same proof techniques applied to the ECDLP setting, but a single canonical "ECDLP bit hardness" reference of the same standing as Håstad–Näslund is harder to point to cleanly. In practice this is the universally held view in the cryptographic community.

The empirical result here is consistent with that belief: no architecture we tried beats 50%.

Data

Range: k = 1 … 1,000,000 (sequential)
Generator: the standard Bitcoin generator G
Splits: sequential by k — train 70% (k=1..700K), val 15% (k=700K..850K), holdout 15% (k=850K..1M).
Inputs: 44 engineered features of X and Y (decimal digit statistics, mod residues, bit-length, popcount, comparisons). Target column k_state and traceability column k are excluded from inputs.

See feature_cols.json for the full input schema.

Models

Model	Architecture	Params	Holdout acc	Holdout AUC	Train time
Logistic Regression	sklearn baseline	tiny	0.4982	0.4983	~3 s
XGBoost	400 trees, depth 6	—	0.5033	0.5036	~3 s
LightGBM	400 trees, 63 leaves	—	0.5002	0.4998	~11 s
MLP	4 layers, 512–512–256	~1M	0.4997	0.5009	~8 s
BitTransformer	4-layer encoder, d=128, over raw 512-bit (X\|Y)	859K	0.5000	0.4998	~2.8 h
Permutation sanity (XGBoost on shuffled labels)	—	—	0.4994	0.4994	—

All accuracies are statistically indistinguishable from 0.5. The BitTransformer specifically tested the hypothesis that bit-level nested XOR/AND patterns might leak the parity — they don't.

Repository structure

.
├── README.md                  # this file
├── metrics.json               # full per-model metrics
├── feature_cols.json          # input feature schema
├── data/
│   └── features_sample_1k.parquet  # 1K-row sample of training data
├── models/
│   ├── xgb.json               # XGBoost
│   ├── lgbm.txt               # LightGBM
│   ├── mlp.pt                 # PyTorch MLP state_dict
│   └── bit_xformer.pt         # PyTorch BitTransformer state_dict
└── scripts/
    ├── gen.py                 # generates feature parquet from k range
    ├── train.py               # trains logreg / XGBoost / LightGBM / MLP
    ├── bit_transformer.py     # trains the BitTransformer
    ├── predict.py             # single-point inference
    ├── batch_eval.py          # batch evaluation on fresh k range
    └── build_html.py          # generates a result-table HTML

Reproducing

# 1. install deps
pip install numpy pandas pyarrow scikit-learn xgboost lightgbm torch huggingface_hub

# 2. generate features (k=1..1M)
python scripts/gen.py 1000000

# 3. train the fast models
python scripts/train.py

# 4. train the bit-transformer (takes ~3 h on an NVIDIA L4)
python scripts/bit_transformer.py

Generation is single-threaded Python big-int arithmetic and is the wall-clock bottleneck (~5 min for 1M points). Training the fast models is seconds; the BitTransformer is hours due to sequence length 513 with 4 attention layers.

Loading the models

# XGBoost
import xgboost as xgb
m = xgb.XGBClassifier(); m.load_model("models/xgb.json")

# LightGBM
import lightgbm as lgb
m = lgb.Booster(model_file="models/lgbm.txt")

# MLP
import torch, torch.nn as nn
D = 44
mlp = nn.Sequential(nn.Linear(D,512), nn.ReLU(),
                    nn.Linear(512,512), nn.ReLU(),
                    nn.Linear(512,256), nn.ReLU(),
                    nn.Linear(256,1))
mlp.load_state_dict(torch.load("models/mlp.pt"))

# BitTransformer
class BitTransformer(nn.Module):
    def __init__(self, seq_len=512, d=128, nhead=4, nlayers=4):
        super().__init__()
        self.tok = nn.Embedding(2, d)
        self.pos = nn.Parameter(torch.randn(1, seq_len, d) * 0.02)
        self.cls = nn.Parameter(torch.randn(1, 1, d) * 0.02)
        enc = nn.TransformerEncoderLayer(d_model=d, nhead=nhead, dim_feedforward=4*d,
                                         batch_first=True, activation="gelu", norm_first=True)
        self.enc = nn.TransformerEncoder(enc, num_layers=nlayers)
        self.head = nn.Linear(d, 1)
    def forward(self, x_bits):
        h = self.tok(x_bits) + self.pos
        cls = self.cls.expand(h.size(0), -1, -1)
        h = torch.cat([cls, h], dim=1)
        h = self.enc(h)
        return self.head(h[:, 0, :]).squeeze(1)
bx = BitTransformer()
bx.load_state_dict(torch.load("models/bit_xformer.pt"))

Intended use

Research: confirming empirically that off-the-shelf ML cannot recover the LSB of a secp256k1 private key from its public-key coordinates.
Education: demonstrating how features that "look informative" (digit sums, mod residues, popcount) carry zero signal about cryptographic targets.
Negative control / sanity check for related ML-cryptanalysis projects.

Out-of-scope use

Any actual cryptographic key-recovery attempt. These models cannot do it; the underlying theory says no efficient model can.
Any conclusion about non-curve-based ciphers or curves other than secp256k1.

Hardware & cost

Trained on a single GCP g2-standard-8 instance with 1× NVIDIA L4 GPU.
Total wall-clock: ~3 hours (mostly the BitTransformer).
Total cost: ~$5 USD.

License

MIT.

Citation

If you reference this in research, please cite by the repo URL.

Downloads last month: -; Downloads are not tracked for this model. How to track