zerooneresearch/predictlm-mini-13m

@@ -15,11 +15,7 @@ tags:
 metrics:
   - accuracy
   - r2
-co2_eq_emissions:
-  emissions: 700
-  source: estimated from Azure EU-North grid factor (~0.3 kg CO₂/kWh) and training compute footprint
-  training_type: distillation
-  geographical_location: Netherlands
 model-index:
   - name: predictlm-mini-13m
     results:
@@ -43,6 +39,26 @@ model-index:
           - type: r2
             value: 0.551
             name: mean R² (n=13, seed=42, fair-set n_features ≤ 128)
 ---
 # predictlm-mini-13m
@@ -185,11 +201,7 @@ Mini was trained via **warm-start sliced distillation**: a novel recipe for comp
 The critical insight: distillation from scratch (Option A in our experiments) **failed to transfer to real OpenML data** — student matched teacher on synthetic but couldn't generalize. Warm-start sliced distillation (Option B, this release) succeeded because the student inherits the teacher's transfer ability as the starting point; distillation only needs to refine.
-### Compute
-- **Carbon**: ~0.7 kg CO₂ (Azure EU-North grid)
-Mini is the **cheapest tabular foundation model release on Hugging Face** by training cost as of 2026-05-14. Reproducible from scratch with `scripts/train_v11_06_tiny.py` in the code repo.
 ## Intended use, limitations, ethical considerations
@@ -200,7 +212,7 @@ Identical to [predictlm-base-26m](https://huggingface.co/zerooneresearch/predict
 - **No personal data in training**: distilled from Base, which was trained on synthetic priors + cleared real-data copulas. No raw eval-set rows seen.
 - **Bias inheritance**: predictions reflect the labeled context the user supplies at inference time
-The known weaknesses (cls below XGBoost; below TabPFN-2.5 / TabICLv2 on both axes) are inherited from Base; Mini does not amplify them but cannot fix them either. Closing the cls gap is targeted in v11.0.6 (Muon + QASSMax + mixed prior).
 ## Reproducibility

 metrics:
   - accuracy
   - r2
+base_model: zerooneresearch/predictlm-base-26m
 model-index:
   - name: predictlm-mini-13m
     results:
           - type: r2
             value: 0.551
             name: mean R² (n=13, seed=42, fair-set n_features ≤ 128)
+      - task:
+          type: tabular-classification
+          name: Tabular Classification (Duo + TTT recipe)
+        dataset:
+          type: openml
+          name: Locked OpenML eval (CC-18 + AMLB + TabPFN-extras), fair-set n_features ≤ 128
+        metrics:
+          - type: accuracy
+            value: 0.751
+            name: mean accuracy with Duo + TTT recipe (Mini + Base + test-time training)
+      - task:
+          type: tabular-regression
+          name: Tabular Regression (Duo + TTT recipe)
+        dataset:
+          type: openml
+          name: Locked OpenML eval (CTR-23 + AMLB), fair-set n_features ≤ 128
+        metrics:
+          - type: r2
+            value: 0.609
+            name: mean R² with Duo + TTT recipe (Mini + Base + test-time training)
 ---
 # predictlm-mini-13m
 The critical insight: distillation from scratch (Option A in our experiments) **failed to transfer to real OpenML data** — student matched teacher on synthetic but couldn't generalize. Warm-start sliced distillation (Option B, this release) succeeded because the student inherits the teacher's transfer ability as the starting point; distillation only needs to refine.
+Reproducible from scratch with `scripts/train_v11_06_tiny.py` in the code repo.
 ## Intended use, limitations, ethical considerations
 - **No personal data in training**: distilled from Base, which was trained on synthetic priors + cleared real-data copulas. No raw eval-set rows seen.
 - **Bias inheritance**: predictions reflect the labeled context the user supplies at inference time
+The known weaknesses (cls below XGBoost; below TabPFN-2.5 / TabICLv2 on both axes) are inherited from Base; Mini does not amplify them but cannot fix them either.
 ## Reproducibility