| RESI miner bundle: USD ONNX + subnet feature_config (14-feature schema) |
| ================================================================================ |
|
|
| A) Training (must match validator: ONNX outputs USD, MAPE vs dollar price) |
|
|
| 1. Default train_real_estate.py exports a **single fused ONNX**: raw features |
| (same order as feature_config) → StandardScaler inside the graph → trees → |
| optional expm1 → USD tensor ``price_usd``. You do **not** need --no-log-target |
| for a valid miner model if you keep default fusion (log1p training + expm1 |
| fused). Input name is ``float_input`` for fused exports. |
|
|
| Alternative: train directly in dollars (no log head in ONNX): |
|
|
| MPLBACKEND=Agg python train_real_estate.py \ |
| --data training_data.json \ |
| --catboost \ |
| --out artifacts_miner_usd \ |
| --no-log-target |
|
|
| Legacy unfused tree-only ONNX (no scaler / no expm1 in graph): |
|
|
| ... --no-onnx-fusion |
|
|
| Use --all / --xgboost / --lightgbm instead of --catboost if you prefer. |
|
|
| 2. Keep the same feature columns as this bundle: train with the default |
| redundant-column dropping (omit --no-drop-redundant) so you have exactly the |
| 14 features listed in miner_submission/feature_config.json. |
|
|
| If you train with --no-drop-redundant (17 columns), regenerate the JSON: |
|
|
| MPLBACKEND=Agg python train_real_estate.py ... --no-drop-redundant \ |
| --write-miner-feature-config miner_submission/feature_config.json |
|
|
| B) Export files for chain / HF repo |
|
|
| 3. Copy the chosen ONNX to your repo as model.onnx (e.g. from |
| artifacts_miner_usd/catboost_price_model.onnx). |
|
|
| 4. Commit miner_submission/feature_config.json alongside the model (same |
| feature order as training / encoder). |
|
|
| 5. ONNX input: one float tensor, shape [N, 14], columns in the exact order |
| of "features" in feature_config.json. Fused models use input ``float_input``. |
| Output is typically a single tensor; fused USD models name it ``price_usd`` |
| (take index 0 if your runner binds by position). |
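
A minimal sketch of the column-order requirement in step 5, using only the Python standard library. The helper name rows_to_matrix and the feature names in the usage lines are hypothetical placeholders, not names taken from feature_config.json; the real list should come from the "features" entry of that file.

```python
from typing import List, Mapping, Sequence


def rows_to_matrix(
    rows: Sequence[Mapping[str, float]],
    feature_order: Sequence[str],
) -> List[List[float]]:
    """Arrange dict rows into an [N, len(feature_order)] matrix.

    Hypothetical helper: columns follow feature_order exactly, and a row
    that lacks a feature raises KeyError instead of silently shifting columns.
    """
    matrix = []
    for i, row in enumerate(rows):
        missing = [name for name in feature_order if name not in row]
        if missing:
            raise KeyError(f"row {i} is missing features: {missing}")
        matrix.append([float(row[name]) for name in feature_order])
    return matrix


# Placeholder names for illustration only; the real bundle lists 14 features.
order = ["living_area", "bedrooms", "bathrooms"]
rows = [{"bedrooms": 3, "bathrooms": 2, "living_area": 1500.0}]
print(rows_to_matrix(rows, order))  # prints [[1500.0, 3.0, 2.0]]
```

The resulting nested list can be converted to float32 (for example with numpy.asarray(matrix, dtype=numpy.float32)) before it is bound to the model input.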
|
|
| C) Validate JSON against subnet rules (optional) |
|
|
| cd /home/RESI-models |
| .venv/bin/python -c " |
| from pathlib import Path |
| from real_estate.data.config_encoder import load_feature_config |
| load_feature_config(Path('/home/46/miner_submission/feature_config.json')) |
| print('feature_config.json OK') |
| " |
|
|
| D) Do not rely on target_transform.json on-chain: the validator does not apply |
| expm1; the model must emit dollars. |
|
|
| E) You cannot change the eval system: submission-only rules |
|
|
| The validator always: encodes raw API fields → float32 matrix (same order as |
| your feature_config) → ONNX → treats outputs as USD for MAPE. |
|
|
| Therefore you MUST NOT depend on any extra JSON, hooks, or server-side |
| preprocessing. Everything the model needs must be inside model.onnx OR you |
| must train without that preprocessing: |
|
|
| • Feature normalization (z-score, min-max): only valid if you fuse those ops |
| into the ONNX graph ahead of the trees. Default train_real_estate.py does |
| this (StandardScaler → trees → optional expm1). Training with a sklearn |
| scaler but submitting a plain tree ONNX that receives raw inputs is wrong: |
| the split thresholds were learned on scaled values. |
|
|
| • log1p(price) training: only valid if the ONNX output is already USD, i.e. |
| expm1 is in the graph (default fused export) or you use --no-log-target. |
|
|
| • For gradient-boosted trees on tabular data, raw features + USD target is |
| usually enough; focus on data and regularization rather than z-score unless |
| you invest in ONNX fusion tools. |
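
To see why a log-space output fails when it is scored as dollars, here is a small sketch using only the standard library. The mape helper and the sample prices are illustrative assumptions, not the validator's actual code; the point is only that outputs left in log1p space score close to 100% error when treated as USD.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.05 = 5%)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up sale prices in USD, for illustration only.
prices = [250_000.0, 480_000.0, 1_200_000.0]

# Pretend a model trained on log1p(price) predicts the log target perfectly.
log_outputs = [math.log1p(p) for p in prices]

# With expm1 fused into the graph, the emitted values are dollars again.
usd_outputs = [math.expm1(z) for z in log_outputs]

print(f"MAPE with expm1 applied:  {mape(prices, usd_outputs):.6f}")
print(f"MAPE with raw log output: {mape(prices, log_outputs):.6f}")
```

Even a model that is perfect in log space is nearly 100% off on every row if expm1 is left out of the graph, which is why the conversion has to live inside model.onnx or be avoided with --no-log-target.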
|
|
| Minimum viable submission: model.onnx (raw in → USD out) + feature_config.json |
| matching column order; no other files required by default. |
|
|