Benjamin Jaeger committed on
Commit
2be279e
·
1 Parent(s): 93b922c

init README

Files changed (1)
  1. README.md +105 -0
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ license: other
+ license_name: tabpfn-2.6-license-v1.0
+ license_link: LICENSE
+ extra_gated_fields:
+   Organization: text
+   Role:
+     type: select
+     options:
+       - Field practitioners
+       - Researcher
+       - Student
+   Use-case: text
+   May we contact you about future updates?: checkbox
+ extra_gated_button_content: Agree to license terms and send request to access repo.
+ extra_gated_description: "Model weights released under\_`tabpfn-2.6-license-v1.0`. This license is designed to be permissive for research and internal evaluation. It *explicitly allows* testing, evaluation, and internal benchmarking, so an organization can download the model and run preliminary assessments on its own datasets.\nThe key restriction is that the model, its derivatives, and its outputs cannot be used for any commercial or production purpose. This includes, but is not limited to, revenue-generating products, competitive benchmarking for procurement, client deliverables, or using the model’s results for internal commercial decision-making.\nFor all production use cases, we offer a *Commercial Enterprise License*. This provides access to our proprietary high-speed inference engine, dedicated support, integration tooling, and other internal models. Please contact us at sales@priorlabs.ai for commercial licensing inquiries."
+ pipeline_tag: tabular-classification
+ tags:
+ - chemistry
+ - biology
+ - finance
+ - legal
+ - climate
+ - medical
+ ---
+ ### Model Overview
+ TabPFN-2.6 is a transformer-based foundation model that uses in-context learning to solve tabular prediction problems in a single forward pass.
+ Inference code can be found at [https://github.com/PriorLabs/tabPFN](https://github.com/PriorLabs/tabPFN).
+
+ ### Getting started
+ First, install the inference package:
+ ```{bash}
+ pip install tabpfn
+ ```
+
+ Fitting a classifier and predicting looks like this:
+
+ ```{python}
+ from sklearn.datasets import load_breast_cancer
+ from sklearn.metrics import accuracy_score
+ from sklearn.model_selection import train_test_split
+ from tabpfn import TabPFNClassifier
+
+ # Load data
+ X, y = load_breast_cancer(return_X_y=True)
+ X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
+
+ # Initialize a classifier and fit it on the training split
+ clf = TabPFNClassifier()
+ clf.fit(X_train, y_train)
+
+ # Predict class probabilities (one column per class)
+ prediction_probabilities = clf.predict_proba(X_test)
+ # Predict labels
+ predictions = clf.predict(X_test)
+ print("Accuracy", accuracy_score(y_test, predictions))
+ ```
+
+ For more examples (e.g. how to train a regressor), see the GitHub repo: [https://github.com/PriorLabs/tabPFN](https://github.com/PriorLabs/tabPFN).
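
The regressor workflow is expected to mirror the classifier example. The sketch below assumes that the `tabpfn` package also exports a `TabPFNRegressor` with the same scikit-learn-style `fit`/`predict` interface. The helper `fit_predict_regressor` and its `make_model` parameter are illustrative names, not part of the package.

```python
def fit_predict_regressor(X_train, y_train, X_test, make_model=None):
    """Fit a regressor on the training split and return predictions for X_test.

    make_model is a hypothetical hook: any zero-argument callable that returns
    an object with scikit-learn-style fit/predict methods.
    """
    if make_model is None:
        # Assumption: tabpfn exposes TabPFNRegressor alongside TabPFNClassifier.
        # The import is deferred so the helper also works with other models.
        from tabpfn import TabPFNRegressor
        make_model = TabPFNRegressor
    model = make_model()
    model.fit(X_train, y_train)
    return model.predict(X_test)
```

With `tabpfn` installed and the model weights accessible, calling `fit_predict_regressor(X_train, y_train, X_test)` uses the default model. Passing `make_model` runs the same code with any other scikit-learn-style regressor, which is convenient for baseline comparisons.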
+
+ ### Developers & Affiliations
+ Developed by Prior Labs.
+
+ ### Intended Use
+ Regression and classification tasks on structured tabular data with ≤50,000 samples and ≤2,000 features.
+
+ ### Not Intended Use
+ - Not suitable for unstructured data (text, images); use the API version for textual features.
+ - Not tested for >50,000 samples or >2,000 features.
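
The size limits above can be checked before fitting. The following is a minimal sketch, not part of the `tabpfn` package: the function name `check_tested_range` and the constant names are hypothetical, and only the two numeric limits come from this model card.

```python
MAX_SAMPLES = 50_000   # tested sample limit stated in this model card
MAX_FEATURES = 2_000   # tested feature limit stated in this model card


def check_tested_range(n_samples, n_features):
    """Return a list of warnings; an empty list means the shape is within the tested range."""
    warnings = []
    if n_samples > MAX_SAMPLES:
        warnings.append(
            f"{n_samples} samples exceeds the tested limit of {MAX_SAMPLES}"
        )
    if n_features > MAX_FEATURES:
        warnings.append(
            f"{n_features} features exceeds the tested limit of {MAX_FEATURES}"
        )
    return warnings
```

For a two-dimensional NumPy array `X`, the call would be `check_tested_range(*X.shape)`. The breast cancer dataset from the getting-started example (569 samples, 30 features) produces no warnings.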
+
+ ### Model Architecture
+ A 24-layer transformer with TabPFNv2-like alternating attention.
+
+ ### Training Data and Priors
+ TabPFN-2.6 is trained purely on synthetic tabular tasks.
+
+ ### Performance Benchmarks
+ Evaluated on a proprietary benchmark collection, TabArena, and RealCause (for a causal version); it achieves new state-of-the-art (SOTA) results on each.
+
+ ### Ethical Considerations
+ Because it was trained purely on synthetic datasets, TabPFN-2.6 is free from dataset leakage from the pretraining stage.
+ However, as with any other tabular prediction method, users applying it to high-risk use cases should ensure that the labelled data is free of biases.
+
+ ### Limitations
+ Performance can degrade when applied to >50,000 data points and/or >2,000 features.
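
One common workaround when a training set exceeds the tested sample range is uniform random subsampling of the training rows. This model card does not prescribe that approach. The sketch below is an assumption-laden illustration using only the standard library, and `subsample_indices` is a hypothetical helper name.

```python
import random


def subsample_indices(n_samples, max_samples=50_000, seed=0):
    """Return sorted row indices, at most max_samples of them, drawn without replacement."""
    if n_samples <= max_samples:
        # Already within the limit: keep every row.
        return list(range(n_samples))
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return sorted(rng.sample(range(n_samples), max_samples))
```

The returned indices can be used to select rows of the training features and labels together. For imbalanced classification data, a stratified subsample may be a better choice than a uniform one.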
+
+ ### Licensing
+ Model weights are released under `tabpfn-2.6-license-v1.0`.
+
+ The license is designed to be permissive for research and limited internal evaluation. It *explicitly allows* testing, evaluation, and internal benchmarking, so an organization can download the model and run preliminary assessments on its own datasets.
+ The key restriction is that the model, its derivatives, and its outputs cannot be used for any commercial or production purpose. This includes, but is not limited to, revenue-generating products, competitive benchmarking for procurement, client deliverables, or using the model’s results for internal commercial decision-making.
+ For all production use cases, we offer a *Commercial Enterprise License*. This provides access to our proprietary high-speed inference engine, dedicated support, integration tooling, and other internal models.
+ Please contact us at sales@priorlabs.ai for commercial licensing inquiries.
+
+ ### Version
+ v1.0: initial release.
+
+ ### Citation
+ ```
+ @misc{TabPFN-2.5,
+       title={TabPFN-2.5},
+       author={Léo Grinsztajn and Klemens Flöge and Oscar Key and Felix Birkel and Brendan Roof and Phil Jund and Benjamin Jäger and Adrian Hayler and Dominik Safaric and Simone Alessi and Felix Jablonski and Mihir Manium and Rosen Yu and Anurag Garg and Jake Robertson and Shi Bin (Liam) Hoo and Vladyslav Moroshan and Magnus Bühler and Lennart Purucker and Clara Cornu and Lilly Charlotte Wehrhahn and Alessandro Bonetto and Sauraj Gambhir and Noah Hollmann and Frank Hutter},
+       year={2025}
+ }
+ ```