Improve model card: Add pipeline tag, paper/project links, authors & full content

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +126 -36
README.md CHANGED
@@ -1,7 +1,12 @@
1
  ---
2
  license: mit
 
3
  ---
4
 
 
 
 
 
5
  <div align="center">
6
  <img src="logo.png" alt="Orion-BiX Logo" width="700"/>
7
  </div>
@@ -10,7 +15,7 @@ license: mit
10
  <a href="https://lexsi.ai/">
11
  <img src="https://img.shields.io/badge/Lexsi-Homepage-FF6B6B?style=for-the-badge" alt="Homepage"/>
12
  </a>
13
- <a href="https://huggingface.co/Lexsi">
14
  <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-Lexsi AI-FFD21E?style=for-the-badge" alt="Hugging Face"/>
15
  </a>
16
  <a href="https://discord.gg/dSB62Q7A">
@@ -21,52 +26,58 @@ license: mit
21
  </a>
22
  </div>
23
 
 
 
 
24
 
 
25
 
26
- # Orion-BiX: Bi-Axial Meta-Learning Model for Tabular In-Context Learning
27
-
28
- **Orion-BiX** is an advanced tabular foundation model that combines **Bi-Axial Attention** with **Meta-Learning** capabilities for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training.
29
 
30
- The model is part of **Orion**, a family of tabular foundation models with various enhancements.
31
 
32
  ### Key Innovations
33
 
34
- 1. **Bi-Axial Attention**: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions within tabular data
35
- 2. **Meta-Learning with k-NN Support Selection**: Episode-based training with intelligent support set selection using similarity metrics
36
- 3. **Three-Component Architecture**: Column embedding (Set Transformer), Bi-Axial row interaction, and In-Context Learning prediction
 
 
 
 
 
37
 
38
- ### Architecture Overview
39
 
40
  ```
41
- Input → tf_col (Set Transformer) → Bi-Axial Attention → tf_icl (ICL) → Output
42
  ```
43
- **Component Details:**
44
- - **tf_col (Column Embedder)**: Set Transformer for statistical distribution learning across features
45
- - **Bi-Axial Attention**: Replaces standard RowInteraction with alternating attention patterns:
46
- - Standard Cross-Feature Attention
47
- - Grouped Feature Attention
48
- - Hierarchical Feature Attention
49
- - Relational Feature Attention
50
- - CLS Token Aggregation
51
- - **tf_icl (ICL Predictor)**: In-context learning module for few-shot prediction
52
 
53
 
54
- ## Usage
55
-
56
- ```python
57
- from orion_bix.sklearn import OrionBiXClassifier
58
 
59
- # Initialize and use
60
- clf = OrionBiXClassifier()
61
- clf.fit(X_train, y_train)
62
- predictions = clf.predict(X_test)
63
  ```
64
-
65
- This code will automatically download the pre-trained model from Hugging Face and use a GPU if available.
66
 
67
  ## Installation
68
 
 
 
 
 
 
 
69
  ### From the source
 
70
  #### Option 1: From the local clone
71
 
72
  ```bash
@@ -75,19 +86,98 @@ pip install -e .
75
  ```
76
 
77
  #### Option 2: From the Git Remote
78
-
79
  ```bash
80
  pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
81
  ```
82
 
83
  ## Citation
84
 
85
- If you use Orion-BiX in your research, please cite:
86
 
87
  ```bibtex
88
- @misc{bouadi25oriobix,
89
- title={Orion-BiX: Bi-Axial Meta-Learning for Tabular In-Context Learning},
90
- author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
91
- year={2025},
 
 
 
 
92
  }
93
- ```
1
  ---
2
  license: mit
3
+ pipeline_tag: table-question-answering
4
  ---
5
 
6
+ This model is presented in the paper [Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning](https://huggingface.co/papers/2512.00181).
7
+ Authors: Mohamed Bouadi, Pratinav Seth, Aditya Tanna, Vinay Kumar Sankarapu
8
+ Project Page: https://www.lexsi.ai/
9
+
10
  <div align="center">
11
  <img src="logo.png" alt="Orion-BiX Logo" width="700"/>
12
  </div>
 
15
  <a href="https://lexsi.ai/">
16
  <img src="https://img.shields.io/badge/Lexsi-Homepage-FF6B6B?style=for-the-badge" alt="Homepage"/>
17
  </a>
18
+ <a href="https://huggingface.co/Lexsi/Orion-BiX">
19
  <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-Lexsi AI-FFD21E?style=for-the-badge" alt="Hugging Face"/>
20
  </a>
21
  <a href="https://discord.gg/dSB62Q7A">
 
26
  </a>
27
  </div>
28
 
29
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
30
+ [![PyTorch](https://img.shields.io/badge/PyTorch-2.2+-orange.svg)](https://pytorch.org/)
31
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
32
 
33
+ # Orion-BiX: Bi-Axial Meta-Learning for Tabular In-Context Learning
34
 
35
+ **[Orion-BiX](https://arxiv.org/abs/2512.00181)** is an advanced tabular foundation model that combines **Bi-Axial Attention** with **Meta-Learning** capabilities for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training, achieving state-of-the-art performance on domain-specific benchmarks in areas such as healthcare and finance.
 
 
36
 
37
+ ## 🏗️ Approach and Architecture
38
 
39
  ### Key Innovations
40
 
41
+ Orion-BiX introduces the following key innovations:
42
+
43
+ 1. **Bi-Axial Attention**: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions
44
+ 2. **Meta-Learning**: Episode-based training with k-NN support selection for few-shot learning
45
+ 3. **Configurable Architecture**: Flexible design supporting various attention mechanisms and training modes
46
+ 4. **Production Ready**: Memory optimization, distributed training support, and scikit-learn interface
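
The k-NN support selection mentioned in item 2 can be illustrated with a minimal sketch. This is not the Orion-BiX implementation: the function name `select_support`, the use of Euclidean distance, and the toy data are assumptions chosen for illustration. The idea is only that, for each query row, the support set is built from the most similar rows in a candidate pool.

```python
import math


def select_support(query, pool, k=3):
    """Return the indices of the k pool rows closest to `query`.

    Hypothetical helper: Euclidean distance stands in for whatever
    similarity metric the real training pipeline uses.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Pair each distance with its row index so ties resolve by index.
    distances = sorted((math.dist(query, row), idx) for idx, row in enumerate(pool))
    return [idx for _, idx in distances[:k]]


# Toy candidate pool of four rows with two numeric features each.
pool = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.2, 0.1]]
support_idx = select_support([0.0, 0.1], pool, k=2)
```

In an episode-based setup, the selected rows (with their labels) would form the support set and the query row would be the prediction target.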
47
+
48
+ ### Component Details
49
 
50
+ Orion-BiX follows a three-component architecture:
51
 
52
  ```
53
+ Input → Column Embedder (Set Transformer) → Bi-Axial Attention → ICL Predictor → Output
54
  ```
 
55
 
56
+ 1. **Column Embedder**: Set Transformer, inherited from TabICL, for statistical distribution learning across features
57
+ 2. **Bi-Axial Attention**: Replaces standard RowInteraction with alternating attention patterns:
58
+ - **Standard Cross-Feature Attention**: Direct attention between features
59
+ - **Grouped Feature Attention**: Attention within feature groups
60
+ - **Hierarchical Feature Attention**: Hierarchical feature patterns
61
+ - **Relational Feature Attention**: Full feature-to-feature attention
62
+ - **CLS Token Aggregation**: Multiple CLS tokens (default: 4) for feature summarization
63
+ 3. **ICL Predictor (`tf_icl`)**: In-context learning module for few-shot prediction
64
 
65
+ Each `BiAxialAttentionBlock` applies four attention patterns in sequence, followed by CLS aggregation:
 
 
 
66
 
 
 
 
 
67
  ```
68
+ Standard → Grouped → Hierarchical → Relational → CLS Aggregation
69
+ ```
70
 
71
  ## Installation
72
 
73
+ ### Prerequisites
74
+
75
+ - Python 3.9-3.12
76
+ - PyTorch 2.2+ (with CUDA support recommended)
77
+ - CUDA-capable GPU (recommended for training)
78
+
79
  ### From the source
80
+
81
  #### Option 1: From the local clone
82
 
83
  ```bash
 
86
  ```
87
 
88
  #### Option 2: From the Git Remote
 
89
  ```bash
90
  pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
91
  ```
92
 
93
+ ## Usage
94
+
95
+ Orion-BiX provides a scikit-learn compatible interface for easy integration:
96
+
97
+ ```python
98
+ from orion_bix.sklearn import OrionBixClassifier
99
+
100
+ # Initialize the classifier
101
+ clf = OrionBixClassifier()
102
+
103
+ # Fit the model (prepares data transformations)
104
+ clf.fit(X_train, y_train)
105
+
106
+ # Make predictions
107
+ predictions = clf.predict(X_test)
108
+ probabilities = clf.predict_proba(X_test)
109
+ ```
110
+
111
+ ## Preprocessing
112
+
113
+ Orion-BiX includes automatic preprocessing that handles:
114
+
115
+ 1. **Categorical Encoding**: Automatically encodes categorical features using ordinal encoding
116
+ 2. **Missing Value Imputation**: Handles missing values using median imputation for numerical features
117
+ 3. **Feature Normalization**: Supports multiple normalization methods:
118
+ - `"none"`: No normalization
119
+ - `"power"`: Yeo-Johnson power transform
120
+ - `"quantile"`: Quantile transformation to normal distribution
121
+ - `"quantile_rtdl"`: RTDL-style quantile transform
122
+ - `"robust"`: Robust scaling using median and quantiles
123
+ 4. **Outlier Handling**: Clips outliers beyond a specified Z-score threshold (default: 4.0)
124
+ 5. **Feature Permutation**: Applies systematic feature shuffling for ensemble diversity:
125
+ - `"none"`: Original feature order
126
+ - `"shift"`: Circular shifting
127
+ - `"random"`: Random permutation
128
+ - `"latin"`: Latin square patterns (recommended)
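
To make the four permutation modes concrete, here is a small sketch of how per-member feature orderings could be generated. It is an illustration under stated assumptions, not the library's code: `feature_orders` is a hypothetical name, and the `"latin"` branch uses one simple construction (a cyclic Latin square with shuffled rows and columns) that may differ from what Orion-BiX does internally.

```python
import random


def feature_orders(n_features, n_members, method="latin", seed=0):
    """Return one feature ordering (list of column indices) per ensemble member."""
    if n_features <= 0:
        raise ValueError("n_features must be positive")
    rng = random.Random(seed)
    base = list(range(n_features))
    if method == "none":
        # Every member sees the original column order.
        return [base[:] for _ in range(n_members)]
    if method == "shift":
        # Member i sees the columns rotated left by i positions.
        return [base[i % n_features:] + base[:i % n_features] for i in range(n_members)]
    if method == "random":
        # Independent random permutation per member.
        return [rng.sample(base, n_features) for _ in range(n_members)]
    if method == "latin":
        # Cyclic Latin square with shuffled rows and columns; shuffling
        # preserves the property that each feature appears exactly once
        # in every row and every column position.
        rows = rng.sample(base, n_features)
        cols = rng.sample(base, n_features)
        square = [[(r + c) % n_features for c in cols] for r in rows]
        # Rows repeat if more members are requested than there are features.
        return [square[i % n_features] for i in range(n_members)]
    raise ValueError(f"unknown method: {method!r}")


orders = feature_orders(4, 4, method="latin", seed=1)
```

With a Latin square, each feature occupies each position once across the first `n_features` members, which spreads positional effects more evenly than independent random shuffles.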
129
+
130
+ The preprocessing is automatically applied during `fit()` and `predict()`, so no manual preprocessing is required.
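
For readers who want to see what steps 1, 2 and 4 amount to, the following standard-library sketch mimics them on single columns. The helper names are hypothetical and the details are assumptions (for example, categories are numbered in order of first appearance, and clipping statistics are computed on the column passed in); the actual preprocessing inside Orion-BiX may differ.

```python
import statistics


def ordinal_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, z_threshold=4.0):
    """Clip values lying more than `z_threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # constant column: nothing to clip
    low, high = mean - z_threshold * std, mean + z_threshold * std
    return [min(max(v, low), high) for v in values]


encoded = ordinal_encode(["red", "blue", "red", "green"])
imputed = impute_median([1.0, None, 3.0])
clipped = clip_outliers([0.0] * 99 + [1000.0])
```

In a real pipeline the median and the clipping bounds would be estimated on the training data during `fit()` and reused unchanged at prediction time.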
131
+
132
+ ## Performance
133
+
134
+ <div align="center">
135
+ <img src="figures/accuracy_ranking_talent.png" alt="Accuracy Ranking TALENT" width="700"/>
136
+ </div>
137
+
138
+ <div align="center">
139
+ <img src="figures/accuracy_ranking_tabzilla.png" alt="Accuracy Ranking TabZilla" width="700"/>
140
+ </div>
141
+
142
+ <div align="center">
143
+ <img src="figures/accuracy_ranking_openml-cc18.png" alt="Accuracy Ranking OPENML-CC18" width="700"/>
144
+ </div>
145
+
146
+ <div align="center">
147
+ <table>
148
+ <tr>
149
+ <td style="padding: 5px;"><img src="figures/relative_acc_improvement_over_tabzilla.png" alt="Relative Improvement over XGBoost on TabZilla" width="700"/></td>
150
+ </tr>
151
+ </table>
152
+ </div>
153
+
154
  ## Citation
155
 
156
+ If you use Orion-BiX in your research, please cite our [paper](https://arxiv.org/abs/2512.00181):
157
 
158
  ```bibtex
159
+ @article{bouadi2025orionbix,
160
+ title={Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning},
161
+ author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
162
+ year={2025},
163
+ eprint={2512.00181},
164
+ archivePrefix={arXiv},
165
+ primaryClass={cs.LG},
166
+ url={https://arxiv.org/abs/2512.00181},
167
  }
168
+ ```
169
+
170
+ ## License
171
+
172
+ This project is released under the MIT License. See [LICENSE](LICENSE) for details.
173
+
174
+ ## Contact
175
+
176
+ For questions, issues, or contributions, please:
177
+ - Open an issue on [GitHub](https://github.com/Lexsi-Labs/Orion-BiX/issues)
178
+ - Join our [Discord](https://discord.gg/dSB62Q7A) community
179
+
180
+
181
+ ## 🙏 Acknowledgments
182
+
183
+ Orion-BiX is built on top of [TabICL](https://github.com/soda-inria/tabicl), a tabular foundation model for in-context learning. We gratefully acknowledge the TabICL authors for their foundational work and for making their codebase publicly available.