---
license: mit
library_name: xgboost
pipeline_tag: tabular-regression
tags:
- tabular-regression
- ens
- ethereum
- web3
- domain-names
- price-prediction
- nft
datasets:
- quantumly/ens-appraiser-data
base_model: sentence-transformers/all-mpnet-base-v2
metrics:
- r_squared
- median_ape
- rmse
model-index:
- name: ENS Appraiser v0.2
  results:
  - task:
      type: tabular-regression
      name: ENS Domain Price Prediction
    dataset:
      name: ENS Appraiser Multi-source Training Data
      type: quantumly/ens-appraiser-data
    metrics:
    - type: r_squared
      value: 0.3081
      name: R² (log USD, test)
    - type: median_ape
      value: 1.383
      name: Median APE (test)
    - type: rmse
      value: 1.5469
      name: RMSE (log USD, test)
---

# ENS Appraiser v0.2

A gradient-boosted regressor that predicts the USD sale price of an
ENS (`.eth`) domain name from on-chain history, semantic embeddings of the
label, and macro-market context.

This is the **v0 baseline** — handcrafted features + mpnet PCA + KNN
comparable-sale aggregates. Built to establish an honest, leakage-free
floor that future versions improve on.

## Quick numbers

Trained on ~265k ENS secondary sales (Jan 2022 – Sep 2023), evaluated on
2,744 sales in **Q1–Q2 2024** (held out by date, never seen during training):

| Split | n       | R² (log USD) | RMSE (log USD) | Median APE | Bias   |
|-------|---------|--------------|----------------|------------|--------|
| Train | 265,240 | 0.7700       | 0.7744         | 32.5%      | +0.000 |
| Val   | 3,545   | 0.6602       | 1.0678         | 57.0%      | +0.203 |
| Test  | 2,744   | **0.3081**   | 1.5469         | 138.3%     | +0.732 |

**Plain-English read:** for a typical mid-tier name in test, the model is
within ~2× of the actual sale price. The long tail — celebrity names,
3-letter premiums, regime shifts — is where it misses, often by 100×+ in
either direction.
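
As a reference for how the table's columns are defined, here is a minimal sketch (with made-up prices, not model output) computing R², RMSE, and bias in log-USD space and median APE on raw USD:

```python
import numpy as np

# Made-up actual and predicted prices in USD; all values are illustrative.
actual_usd = np.array([120.0, 450.0, 80.0, 2500.0, 60.0])
pred_usd = np.array([100.0, 600.0, 95.0, 1200.0, 55.0])

y, p = np.log(actual_usd), np.log(pred_usd)

rmse_log = np.sqrt(np.mean((p - y) ** 2))                      # RMSE (log USD)
r2 = 1.0 - np.sum((y - p) ** 2) / np.sum((y - y.mean()) ** 2)  # R² (log USD)
bias = np.mean(p - y)          # mean log residual; + means over-prediction

# APE is taken on raw dollar amounts, so a Median APE of 1.383 (138.3%)
# means the typical test-set miss is roughly a 2.4× error.
median_ape = np.median(np.abs(pred_usd - actual_usd) / actual_usd)
```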

## What's good

- **Mid-tier names, $50–$5,000 range:** usually within 2× of actual.
- **Length and character composition:** strong signals captured well.
  The model knows 3-letter ASCII names are premium and 12-letter random
  handles are cheap.
- **Wordlist hits:** matches against Wikipedia, GeoNames, US first names,
  stock tickers, and SEC EDGAR are picked up correctly. `paris.eth` is
  flagged as a city, `nike.eth` as a brand.
- **Comparable-sale anchoring:** the top two features are `knn_mean_log`
  and `knn_p90_log` — the model leans heavily on "what did similar names
  sell for recently?", which is the right intuition for valuation.
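
The wordlist signals above reduce to set-membership flags plus a total. A toy sketch, with tiny stand-in sets in place of the real Wikipedia/GeoNames/ticker dumps:

```python
# Toy wordlists: each curated list becomes one binary feature plus a total.
# The sets here are illustrative stand-ins, not the real dumps.
WORDLISTS = {
    "in_wikipedia": {"paris", "nike", "bitcoin"},
    "in_geonames": {"paris", "london"},
    "in_tickers": {"nke", "aapl"},
}

def wordlist_features(label: str) -> dict:
    # One 0/1 flag per list, plus a wordlist_hits total.
    feats = {name: int(label in words) for name, words in WORDLISTS.items()}
    feats["wordlist_hits"] = sum(feats.values())
    return feats

wordlist_features("paris")  # flags wikipedia + geonames, 2 hits total
```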

## What's not

- **Celebrity / brand premium:** a name's value to a known buyer
  (Coinbase wanting `coinbase.eth`, a luxury brand wanting their mark)
  is invisible to this model. It can detect that `nike.eth` is a brand
  word, but not that the sale price reflects Nike's interest specifically.
- **3-letter premium tail:** names like `mph.eth` and `uma.eth` sold for
  $20k–$40k in test; the model predicted $100–$200. The training set
  underweights short premiums because most training sales are 5+ letters long.
- **Regime shift on test:** the test-set median price is ~4× higher than the
  training median due to the 2023 → 2024 ENS market shift. Recency-weighted
  training (1-year half-life) helps but doesn't fully close the gap.
- **Bidirectional errors:** the worst predictions split roughly evenly
  between under-prediction (hot names the model didn't recognize) and
  over-prediction (cold names that just didn't move). A 138% median APE is
  honest but uncomfortable.

## How it's built

| Component | Detail |
|---|---|
| Algorithm | XGBoost regressor (170 boosted trees, max_depth=7) |
| Target | `log(sale_price_usd)` |
| Features | 146 total |
| Training data | 265,240 sales, Jan 2022 – Sep 2023 |
| Training time | ~10 min on a single A100 |
| Model size | 3.3 MB |

### Feature breakdown

- **Handcrafted (15):** length, n_digits, n_letters, n_special, palindrome,
  is_all_digits, is_all_letters, is_ascii, has_unicode, starts/ends_digit,
  max_char_run, n_unique_chars
- **Wordlist hits (8):** Wikipedia titles, GeoNames cities, US first names,
  ISO 3166 countries, stock tickers, SEC EDGAR companies, Wiktionary EN,
  plus a `wordlist_hits` total
- **Grails clubs (~45):** binary membership in each curated `.eth` club
  (`999club`, `pre-punks`, `palindromes`, `pokemon_gen1`, etc.)
- **Trademark conflict (1):** active USPTO mark in Nice classes 9, 35, 36,
  38, 41, 42, 45 with matching `mark_text_norm`
- **Holder behavior (2):** `name_age_days`, `prior_transfer_count`
  (leakage-safe — only counts transfers strictly before the sale block)
- **Macro context (5):** Fear & Greed Index, ETH chain TVL, ETH stablecoin
  market cap, ETH DEX volume, total NFT marketplace fees on the sale day
- **mpnet PCA (64):** 768-dim `all-mpnet-base-v2` embeddings of the label,
  PCA-reduced to 64 dims (95% explained variance)
- **KNN comparable sales (8):** for each label, FAISS-retrieve the top-50
  semantic neighbors (HNSW index), filter near-duplicates (sim > 0.999),
  take the most recent prior sale of each, and aggregate as `knn_count`,
  `knn_mean_log`, `knn_median_log`, `knn_p90_log`, `knn_max_sim`,
  `knn_min_sim`, `knn_log_max`, `knn_log_min`. **Strict leakage prevention:**
  only neighbors with sales **before** the current sale's date count.
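
The KNN aggregation step above can be sketched in plain Python. FAISS retrieval is omitted, the neighbor-tuple structure is assumed for illustration, a crude percentile stands in for the real p90, and all values are made up:

```python
from datetime import date
from statistics import mean, median

# Hypothetical neighbor records: (label, similarity, sale_date, log_usd_price).
# In the real pipeline these come from a FAISS HNSW top-50 retrieval.
neighbors = [
    ("paris", 0.98, date(2023, 3, 1), 6.2),
    ("paris", 0.98, date(2023, 6, 1), 6.8),    # more recent sale, same label
    ("parls", 0.9995, date(2023, 2, 1), 5.0),  # near-duplicate: dropped
    ("london", 0.91, date(2023, 8, 15), 7.1),
    ("berlin", 0.88, date(2024, 2, 1), 8.0),   # after target sale: dropped
]

def knn_features(neighbors, target_sale_date):
    # 1. Leakage guard: only sales strictly before the target sale count.
    prior = [n for n in neighbors if n[2] < target_sale_date]
    # 2. Drop near-duplicates of the target label itself (sim > 0.999).
    prior = [n for n in prior if n[1] <= 0.999]
    # 3. Keep only the most recent prior sale per neighbor label.
    latest = {}
    for label, sim, d, log_p in prior:
        if label not in latest or d > latest[label][2]:
            latest[label] = (label, sim, d, log_p)
    comps = list(latest.values())
    if not comps:
        return {"knn_count": 0}
    logs = sorted(p for _, _, _, p in comps)
    sims = [s for _, s, _, _ in comps]
    p90 = logs[min(len(logs) - 1, int(0.9 * len(logs)))]  # crude p90
    return {
        "knn_count": len(comps),
        "knn_mean_log": mean(logs),
        "knn_median_log": median(logs),
        "knn_p90_log": p90,
        "knn_max_sim": max(sims),
        "knn_min_sim": min(sims),
        "knn_log_max": logs[-1],
        "knn_log_min": logs[0],
    }

feats = knn_features(neighbors, target_sale_date=date(2024, 1, 10))
```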

### Top 10 features by gain

| Rank | Feature | Gain |
|---:|---|---:|
| 1 | `knn_mean_log` | 1,714 |
| 2 | `knn_p90_log` | 1,613 |
| 3 | `len` | 1,364 |
| 4 | `in_wikipedia` | 1,052 |
| 5 | `is_all_digits` | 944 |
| 6 | `knn_median_log` | 604 |
| 7 | `n_digits` | 338 |
| 8 | `pca_000` | 289 |
| 9 | `n_clubs` | 282 |
| 10 | `ends_digit` | 277 |

Four of the top ten are KNN-comp or PCA features, which means the
embedding pipeline is doing real work — it's not just paying for itself;
it's the dominant signal alongside length.

## Training data + leakage controls

Built from the [`quantumly/ens-appraiser-data`](https://huggingface.co/datasets/quantumly/ens-appraiser-data)
dataset:

- **Sales labels:** Alchemy `getNFTSales` for the ENS BaseRegistrar + NameWrapper
  contracts. Wei amounts converted to USD via CoinGecko hourly OHLC at
  the sale's block timestamp. **Coverage gap:** Alchemy `getNFTSales` v2
  truncates at block 19,768,978 (May 2024) and does not index Blur
  marketplace sales. v0 ships with this gap; closing it is a v1 priority.
- **Registrations + transfers:** The Graph's [ENS subgraph](https://thegraph.com/explorer/subgraphs/5XqPmWe6gjyrJtFn9cLy237i4cWw2j9HcUJEXsP5qGtH).
- **Wordlists:** Wiktionary dumps, Wikipedia EN article titles, GeoNames
  `cities500`, US Census baby names, NASDAQ Trader ticker dumps,
  SEC EDGAR company tickers, ISO 3166 country list.
- **Macro:** alternative.me Fear & Greed Index, DefiLlama (TVL, stablecoin
  mcap, DEX volume, NFT marketplace fees).
- **Trademarks:** USPTO Trademark Case Files Dataset (annual research dump).
- **Embeddings:** `sentence-transformers/all-mpnet-base-v2`, encoded once
  for all 3.5M ENS labels in the dataset.

### Leakage controls

The first version of this model accidentally leaked future information
through `lifetime_transfer_count` (it counted *all* transfers ever for a
labelhash, including transfers that happened *after* the sale being
predicted). The leaky model showed **train R² 0.81 / test R² −0.29** — the
classic catastrophic-overfit signature, where the model does worse on
held-out data than simply predicting the population mean.

The current model uses `prior_transfer_count`, which only counts transfers
where `transfer_block < sale_block` for each row. The fixed feature dropped
to rank #11 in feature importance (the leaky version was #1, by a 3.3×
margin). The KNN comparable-sale features have a similar safeguard: a
neighbor's sale only counts if it happened strictly before the sale being
predicted.
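
The leakage-safe counter is a one-liner; a sketch with made-up block numbers:

```python
# Hypothetical transfer history for one labelhash: the block number of
# each transfer event, in chronological order.
transfer_blocks = [14_100_000, 15_250_000, 17_900_000, 19_500_000]

def prior_transfer_count(blocks, sale_block):
    # Leakage-safe: only transfers strictly before the sale's block count.
    # The leaky lifetime_transfer_count was effectively len(blocks),
    # letting post-sale activity inform the prediction.
    return sum(1 for b in blocks if b < sale_block)

count = prior_transfer_count(transfer_blocks, sale_block=16_000_000)  # 2 of 4
```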

### Train/Val/Test split

Fixed-window temporal split:

- **Train:** sales with `sale_date < 2023-10-01`
- **Val:** sales 2023-10-01 → 2023-12-31
- **Test:** sales 2024-01-01 onwards

This prevents the v0.1 mistake of training on 2022 prices and asking the
model to extrapolate to a 2024 market regime that's ~4× more expensive
on average. Val and test are in the same regime, so val RMSE is a
meaningful proxy for test.

Training rows are weighted with an exponential recency decay (1-year
half-life, normalized to mean=1.0) so the model leans on 2023 dynamics
without throwing away the older data entirely.
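
The recency weighting can be sketched as follows; the 365.25-day half-life constant and the exact definition of `age_days` (days between each sale and the training cutoff) are assumptions about the implementation:

```python
import numpy as np

# Illustrative ages: same-day, 1 year, 2 years, ~3 years before the cutoff.
age_days = np.array([0.0, 365.25, 730.5, 1096.0])

half_life_days = 365.25                    # 1-year half-life
raw = 0.5 ** (age_days / half_life_days)   # 1.0, 0.5, 0.25, ~0.125
weights = raw / raw.mean()                 # normalize to mean 1.0

# A 1-year-old sale carries half the raw weight of a same-day sale,
# but older rows are down-weighted rather than dropped.
```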

## Intended use

This model is intended for **research and analytics**, not as a price
oracle and not for live trading.

**Reasonable uses:**

- Bulk valuation of mid-tier ENS portfolios for tax/accounting purposes
- Identifying obviously over- or under-listed names on secondary markets
- Sanity-checking a listing price before posting
- Producing comparable-sale ranges for negotiation context

**Out of scope:**

- Pricing 3-letter, 1–2 letter, or otherwise-premium names with confidence
- Pricing celebrity / known-brand names where the buyer pool is concentrated
- Predicting prices for names in the post-May-2024 marketplace mix
  (Blur dominance, marketplace fee changes)
- Any high-stakes financial decision based on a single point estimate

## Limitations

- **Sales coverage:** Jan 2022 – May 2024 only, no Blur. ~2 years of recent
  sales (mid-2024 onwards) are missing entirely from training. Closing
  this gap requires either a new sales source (Reservoir and SimpleHash are
  both defunct as of 2024–2025) or direct `eth_getLogs` decoding of Seaport,
  Blur, X2Y2, and LooksRare events, planned for v1.
- **Celebrity premium:** there's no feature here for "is this a famous
  person/place/thing?" beyond Wikipedia-title matching. v1 adds
  LLM-derived structured features (`fame_score`, `name_kind`,
  `crypto_relevance`, `brand_collision_risk`), which should close most
  of this gap.
- **Out-of-distribution labels:** pure-digit labels (`0001`),
  punycode/emoji, and l33tspeak get less benefit from mpnet embeddings
  since they're out of distribution for the pretrained model. Length and
  charset features partially compensate.
- **Time drift:** the ENS market shifts noticeably every 6–12 months as
  marketplace dominance, fee structures, and DAO actions move. Predictions
  on names sold "right now" will lag any regime shift since the training
  cutoff.
- **Test-set thinness:** only 2,744 sales meet the $10 floor and post-Jan-2024
  cutoff. The reported test R² has roughly a ±0.08 95% CI — useful as a
  ballpark, not a precise number.

## How to use

```python
from huggingface_hub import hf_hub_download
import xgboost as xgb
import pickle

model_path = hf_hub_download(
    repo_id="quantumly/ens-appraiser",
    filename="v0_appraiser_xgb.json",
)
pca_path = hf_hub_download(
    repo_id="quantumly/ens-appraiser",
    filename="v0_pca_mpnet.pkl",
)

booster = xgb.Booster()
booster.load_model(model_path)
with open(pca_path, "rb") as f:
    pca = pickle.load(f)

# Inference also requires:
# 1. mpnet embedding for the label (sentence-transformers/all-mpnet-base-v2)
# 2. Handcrafted/wordlist/club/trademark/holder/macro features
# 3. KNN comp lookup against the dataset repo's FAISS index
#
# A self-contained inference notebook is planned in the dataset repo.
```

The 146 features expected by the booster are listed in `v0_metadata.json`
under `feature_cols`, in the exact order required by `xgb.DMatrix`.
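
A sketch of assembling an input row in that order. The three-column metadata blob here is a stand-in for the real 146-column `v0_metadata.json`, and the `features` dict is hypothetical:

```python
import json
import numpy as np

# Stand-in for json.load(open("v0_metadata.json"))["feature_cols"].
metadata = json.loads('{"feature_cols": ["len", "is_all_digits", "knn_mean_log"]}')
feature_cols = metadata["feature_cols"]

# Hypothetical precomputed features for one name; anything missing is
# passed as NaN, which XGBoost treats as a missing value.
features = {"len": 5, "knn_mean_log": 6.9}
row = np.array([[features.get(c, np.nan) for c in feature_cols]], dtype=np.float32)

# dmat = xgb.DMatrix(row, feature_names=feature_cols)
# pred_log = booster.predict(dmat)[0]   # model predicts log(price_usd)
# price_usd = float(np.exp(pred_log))   # invert the log target
```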

## Reproducibility

The training notebook ([`v0_appraiser_v2.ipynb`](https://huggingface.co/datasets/quantumly/ens-appraiser-data/blob/main/notebooks/v0_appraiser_v2.ipynb))
runs end-to-end on a Colab A100 high-RAM instance in ~25 minutes:

1. Downloads all source parquets from the dataset repo
2. Reconstructs USD prices via a CoinGecko hourly OHLC join
3. Resolves labels for both BaseRegistrar and NameWrapper sales
4. Computes all features
5. Builds the HNSW index for KNN
6. Trains XGBoost with early stopping
7. Saves model + metadata + diagnostics
8. Uploads to this model repo

All randomness is seeded (`seed=42` for XGBoost, PCA, sample weights).

## Roadmap

**v1 priorities** (in expected R² delta order):

1. **LLM-derived features** — Llama 3.1 8B local inference over all 3.5M
   labels, extracting `fame_score`, `name_kind`, `cultural_origin`,
   `crypto_relevance`, `brand_collision_risk`, plus a description embedding.
   Expected delta: +0.05–0.10 test R².
2. **Recent sales backfill** via direct `eth_getLogs` decoding of
   Seaport / Blur / Wyvern / X2Y2 / LooksRare events. Closes the
   May 2024 → present coverage gap and adds Blur. Expected delta:
   +0.03–0.06 test R² and a much bigger test set.
3. **Multi-embedding ensemble** — concatenate mpnet with `bge-base-en-v1.5`
   and `e5-base-v2`, PCA the joint space. Expected delta: +0.02–0.04.
4. **Cross-encoder reranker** for KNN comps. Expected delta: +0.02–0.03.
5. **Contrastive fine-tuning** of mpnet on price-similarity triplets.
   Expected delta: +0.03–0.05.

## Citation

```bibtex
@misc{ens_appraiser_2026,
  author    = {Drobnič, Nejc},
  title     = {ENS Appraiser v0.2},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/quantumly/ens-appraiser}
}
```

## License + contact

MIT. Questions, corrections, pull requests: nejc@nejc.dev