01RAI committed on
Commit 950834d · verified · 1 Parent(s): 761a46b

PredictLM v11.0 + Mini ship-bundle

Files changed (1)
  1. README.md +23 -11
README.md CHANGED
@@ -15,11 +15,7 @@ tags:
  metrics:
  - accuracy
  - r2
- co2_eq_emissions:
- emissions: 700
- source: estimated from Azure EU-North grid factor (~0.3 kg CO₂/kWh) and training compute footprint
- training_type: distillation
- geographical_location: Netherlands
  model-index:
  - name: predictlm-mini-13m
  results:
@@ -43,6 +39,26 @@ model-index:
  - type: r2
  value: 0.551
  name: mean R² (n=13, seed=42, fair-set n_features ≤ 128)
  ---

  # predictlm-mini-13m
@@ -185,11 +201,7 @@ Mini was trained via **warm-start sliced distillation**: a novel recipe for comp
 
  The critical insight: distillation from scratch (Option A in our experiments) **failed to transfer to real OpenML data** — student matched teacher on synthetic but couldn't generalize. Warm-start sliced distillation (Option B, this release) succeeded because the student inherits the teacher's transfer ability as the starting point; distillation only needs to refine.
 
- ### Compute
-
- - **Carbon**: ~0.7 kg CO₂ (Azure EU-North grid)
-
- Mini is the **cheapest tabular foundation model release on Hugging Face** by training cost as of 2026-05-14. Reproducible from scratch with `scripts/train_v11_06_tiny.py` in the code repo.

  ## Intended use, limitations, ethical considerations
 
@@ -200,7 +212,7 @@ Identical to [predictlm-base-26m](https://huggingface.co/zerooneresearch/predict
  - **No personal data in training**: distilled from Base, which was trained on synthetic priors + cleared real-data copulas. No raw eval-set rows seen.
  - **Bias inheritance**: predictions reflect the labeled context the user supplies at inference time

- The known weaknesses (cls below XGBoost; below TabPFN-2.5 / TabICLv2 on both axes) are inherited from Base; Mini does not amplify them but cannot fix them either. Closing the cls gap is targeted in v11.0.6 (Muon + QASSMax + mixed prior).

  ## Reproducibility
 
 
  metrics:
  - accuracy
  - r2
+ base_model: zerooneresearch/predictlm-base-26m
  model-index:
  - name: predictlm-mini-13m
  results:
 
  - type: r2
  value: 0.551
  name: mean R² (n=13, seed=42, fair-set n_features ≤ 128)
+ - task:
+ type: tabular-classification
+ name: Tabular Classification (Duo + TTT recipe)
+ dataset:
+ type: openml
+ name: Locked OpenML eval (CC-18 + AMLB + TabPFN-extras), fair-set n_features ≤ 128
+ metrics:
+ - type: accuracy
+ value: 0.751
+ name: mean accuracy with Duo + TTT recipe (Mini + Base + test-time training)
+ - task:
+ type: tabular-regression
+ name: Tabular Regression (Duo + TTT recipe)
+ dataset:
+ type: openml
+ name: Locked OpenML eval (CTR-23 + AMLB), fair-set n_features ≤ 128
+ metrics:
+ - type: r2
+ value: 0.609
+ name: mean R² with Duo + TTT recipe (Mini + Base + test-time training)
  ---

  # predictlm-mini-13m
 
 
  The critical insight: distillation from scratch (Option A in our experiments) **failed to transfer to real OpenML data** — student matched teacher on synthetic but couldn't generalize. Warm-start sliced distillation (Option B, this release) succeeded because the student inherits the teacher's transfer ability as the starting point; distillation only needs to refine.
 
+ Reproducible from scratch with `scripts/train_v11_06_tiny.py` in the code repo.

  ## Intended use, limitations, ethical considerations
 
 
  - **No personal data in training**: distilled from Base, which was trained on synthetic priors + cleared real-data copulas. No raw eval-set rows seen.
  - **Bias inheritance**: predictions reflect the labeled context the user supplies at inference time

+ The known weaknesses (cls below XGBoost; below TabPFN-2.5 / TabICLv2 on both axes) are inherited from Base; Mini does not amplify them but cannot fix them either.

  ## Reproducibility