01RAI committed on
Commit b5bc9ee · verified · 1 Parent(s): 8d72d4a

PredictLM v11.0 + Mini ship-bundle

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -86,6 +86,8 @@ That's it. On the first `.predict()` call the package silently downloads its par
 
 **TTT** ([Test-Time Training](https://arxiv.org/abs/2503.11842), grounded in TabPFN-2.5's [recipe](https://arxiv.org/abs/2511.08667)) does ~15 inner Adam steps of self-supervised fine-tuning on the user's in-context examples before predicting. Per-task specialization on top of a generic ICL prior. 19 / 20 datasets improved vs zero-tuning; no dataset regressed by more than 0.006.
 
+PredictLM's TTT is an independent implementation of the published technique. This repo does not include or derive from TabPFN code or weights — PredictLM weights are trained from scratch on synthetic data and shipped under Apache-2.0.
+
 ## Architecture
 
 Unified architecture: a shared backbone with two task heads (regression via a 1024-bin BarDistribution, classification via per-task masked softmax). The model auto-detects task type from the dtype of `y_train` and routes through the matching head. One `fit/predict` API for both. This unified framing follows [TabICLv2](https://huggingface.co/papers/2602.11139) (Soda Inria, Feb 2026); the closest non-unified precedent is [TabPFN v2](https://huggingface.co/Prior-Labs/TabPFN-v2-clf), which ships separate classifier and regressor checkpoints.
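
The task auto-detection described in the Architecture paragraph can be illustrated with a minimal sketch. This is not PredictLM's code: `infer_task` is a hypothetical helper, it inspects plain Python element types as a stand-in for an array dtype, and the rule that any float label means regression is an assumption made for illustration only.

```python
from typing import Sequence


def infer_task(y_train: Sequence[object]) -> str:
    """Hypothetical helper: choose a task head from the element types of y_train.

    Assumption of this sketch: float labels mean a continuous target
    (regression head); anything else (int, bool, str) is treated as a
    class label (classification head).
    """
    if len(y_train) == 0:
        raise ValueError("y_train must contain at least one label")
    if any(isinstance(value, float) for value in y_train):
        return "regression"
    return "classification"


if __name__ == "__main__":
    print(infer_task([0.5, 1.2, 3.4]))      # float targets
    print(infer_task([0, 1, 1, 0]))         # integer class labels
    print(infer_task(["cat", "dog"]))       # string class labels
```

A real implementation would more likely branch on the array dtype directly, and would need a policy for integer-valued regression targets, which this sketch classifies as class labels.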